Preface


The primary purpose of this book is to give a satisfactory introduction 
to those parts of classical theoretical physics for which a background in 
mechanics and electrodynamics must be assumed. The level and selection 
of the material should make it suitable as a senior or beginning graduate 
text for physics majors. I have, however, constantly tried to keep in 
mind the needs of the large and growing group for whom a knowledge of 
this material is continually becoming more necessary. The general and 
increasing use of solid state devices in engineering makes a familiarity 
with the thermal properties of materials as described by thermodynamics 
and statistical mechanics an essential part of the training of the modern 
engineer. Chemists and metallurgists have really always needed a strong 
background in these fields, and I hope that the point of view of the presen- 
tation I have given will be particularly useful for their general needs as 
well. 

The first part deals with special relativity which is developed from con- 
siderations of electromagnetic phenomena in moving systems and the 
question of whether the inertial system of mechanics is also the correct 
one for electrodynamics. The formulation of mechanics and electromag- 
netic theory in 4-vector and tensor form is then developed so that the 
naturalness and suitability of this description can be emphasized. 

In the next part, thermodynamics is developed as an empirical and 
macroscopic subject. The laws are presented in their “positive” form of 
definite statements about the existence and properties of state functions 
rather than in the “negative” manner of describing the impossibility of
certain mechanisms. It has been my experience that students obtain a 
much better understanding of thermodynamic methods and philosophy 
when this approach is used, and it also becomes much easier for them to 
appreciate the connection between the various potentials and the choice 
of independent variables. They also are less prone to develop the attitude 
that thermodynamics is a branch of physics whose approach and detailed 
methods are mysterious and foreign as compared to the rest of theoretical 
physics. The third law is developed and its principal consequences 
obtained. In the last chapter of this part, I have discussed phase transi- 
tions in single component systems and have included consideration of 
the characteristics of second and higher order transitions. 
The brief treatment of the kinetic theory of gases introduces in a graphic
way the ideas of probability and distributions as essential features of the 
calculation of macroscopic properties as averages of molecular properties. 
Transport processes in non-equilibrium situations are also considered. 

The discussion of statistical mechanics which follows is based on the 
concept of the ensemble as the suitable vehicle for general probability 
calculations. The considerations are initially based on classical phase 
space and Hamiltonian systems since this approach involves familiar 
concepts; the introduction of quantum states is made later by correlating 
them with elementary cells in phase space. A brief survey of the calcula- 
tion of the virial coefficients for a real gas is given as an illustration of the 
use of the Boltzmann distribution and of equating time and ensemble 
averages. Chapters on semiconductors as well as on fluctuations and 
noise have been included because of the importance of these subjects for 
communications and laboratory measurements and because they provide 
good examples of the applications of general statistical mechanical results 
to specific situations. 

The physics preparation I have assumed is classical mechanics including 
Hamilton’s equations and normal modes of coupled systems and electro- 
magnetism expressed by Maxwell’s equations. I have also assumed a 
mathematical background of calculus through partial differentiation; 
what little else in mathematical methods has been needed is developed in
the text. Most of the exercises will be familiar to those acquainted with 
the subject and, of course, this is one of the reasons why these time-tested
ones were chosen. The symbols =, ≃, ~, ∝, ≠ always mean,
respectively: equal to, approximately equal to, of the order of magnitude 
of, proportional to, and different from. 

Much of the organization and content of this book was developed over 
a period of years in connection with courses I taught at the U.S. Naval
Ordnance Laboratory for the University of Maryland and later at the 
University of Arizona, and I am indebted to the many students and others 
who have contributed with their questions and comments. I am grateful 
again to my wife, Cleo Abbott Wangsness, for her encouragement and 
invaluable aid in all phases of the preparation of this book. 


ROALD K. WANGSNESS 


Tucson, Arizona 
October, 1963 
Introduction


What you have inherited from your fathers 
You must earn, in order to possess. 


—Goethe, Faust 


The development of new physical concepts and theories is a continual 
process and usually proceeds by means of refinement and revision of the 
old. It is also generally true that the earlier theories rarely disappear 
completely as a result of such revisions but are present as limiting cases 
of a more comprehensive theory. Very often the basic physical concepts 
which are used in the new theory become so broadened in scope as to 
encompass an astonishing diversity of phenomena. 

The special theory of relativity provides the classic example of how 
the growing inconsistencies with experiment of a previously highly 
successful theoretical scheme prompted a re-examination of the basic 
foundations of physical theory. These investigations had profound and 
far-reaching consequences for physics, and yet the terminology and 
conceptual notions in terms of which the special theory is formulated are 
ones such as momentum, energy, charge, and fields which are familiar 
from the theories of mechanics and electrodynamics for which they were 
originally devised. 

Thermodynamics, which had its origins in the practical problems of 
developing more efficient machinery for the Industrial Revolution, was 
soon developed into a macroscopic theory of great generality and power. 
In its “purest” state, thermodynamics deals only with empirical information,
and its independence of models of the structure of matter gives its
basic generalizations or “laws” a sense of universality and permanency
possessed by no other portion of physics. The interpretation of thermo- 
dynamic properties in terms of molecular properties is the task of kinetic 
theory and statistical mechanics. The basic use of probabilities and 
averages in statistical mechanics has provided new insight into the 
microscopic significance of the laws of thermodynamics; yet the 
original conceptions were general enough that the transition from classi- 
cal mechanics to quantum mechanics as a basic description of matter 
was made with comparative ease and required no drastic changes in the 
methodology of statistical mechanics. 
Selected references


Throughout this book a background in classical mechanics and electro- 
dynamics is assumed which is equivalent to the content of 


R. K. Wangsness, Introduction to Theoretical Physics: Classical Mechanics 
and Electrodynamics, Wiley, 1963. 


Specific references to that book are indicated by the notation (I: 3-13) 
which in this example is to equation (3-13), the same reference system 
being used in both that book and this one. 

A thorough discussion of both special and general relativity can be 
found in 


C. Møller, The Theory of Relativity, Oxford, 1952.


The following book covers many similar topics at about the same 
level as we do: 


A. Sommerfeld, Thermodynamics and Statistical Mechanics, Academic 
Press, 1956. 


An outstanding discussion of the fundamentals is contained in 
R. C. Tolman, The Principles of Statistical Mechanics, Oxford, 1938. 


The following six books provide detailed and extensive coverage of 
their subjects. They include many problems and references. 


M. W. Zemansky, Heat and Thermodynamics (4th ed.), McGraw-Hill, 
1957. 

A. B. Pippard, The Elements of Classical Thermodynamics, Cambridge, 
1957. 

H. B. Callen, Thermodynamics, Wiley, 1960. 

R. D. Present, Kinetic Theory of Gases, McGraw-Hill, 1958. 

D. ter Haar, Elements of Statistical Mechanics, Rinehart, 1954. 

T. L. Hill, An Introduction to Statistical Thermodynamics, Addison- 
Wesley, 1960. 


A brief introduction to a great many examples of statistical con- 
siderations in physics can be found in 


C. Kittel, Elementary Statistical Physics, Wiley, 1958. 




Mathematical methods and results appropriate to the material in this 
book are well presented in 


H. Margenau and G. M. Murphy, The Mathematics of Physics and 
Chemistry (2nd ed.), Van Nostrand, 1956. 

P. M. Morse and H. Feshbach, Methods of Theoretical Physics (2 vols.), 
McGraw-Hill, 1953. 


Part One 


Special Relativity 


1 The experimental basis for special relativity


The basic concern of the theory of relativity is the comparison of results 
obtained by observers of physical phenomena who are moving with 
respect to each other. The special theory, which is the only form we shall 
consider, refers only to the case of two observers who have a constant 
velocity relative to one another. In order to illustrate some of the 
concepts involved in the general problem, we begin by considering how 
two moving observers would describe some electromagnetic effects. 


1-1 Electromagnetic fields in moving coordinate systems 


The vector E in the electromagnetic field equations represents the 
electric field at a given point in space (x, y, z) and at a given time t, all of
which are specified in terms of a particular set of coordinate axes. The 
person who is observing and describing the phenomena is regarded as at 
rest with respect to these axes. 

Suppose now that the same phenomena are to be observed and described 
by someone else who wishes to use coordinate axes which are moving 
with respect to the first set. We represent the coordinates and time 
referred to this second set of axes by x’, y’, z’, t’. The second observer is
at rest with respect to his set of axes, and it is assumed to be just as possible 
to describe the phenomena from his point of view as from the point of 
view of the first observer. Then there naturally arises the question: Will 
the two observers describe things in the same way or will their descriptions 
differ in any essential respects?

The simplest situation of this kind is that in which the two systems, 
S and S’, are moving with constant relative speed v along the common 
direction of their x and x’ axes as shown in Fig. 1-1. The usual formulas
relating the two sets of coordinates and time for a given “event” are those
of the classical (or Galilean) transformation (I: 3-13). Since the only
component of the relative velocity is v_x = v, these transformation
equations become


$$ x' = x - vt, \qquad y' = y, \qquad z' = z, \qquad t' = t \tag{1-1} $$

if we also assume for simplicity that the two origins O and O’ coincide
at t = 0. If one of these sets of axes is an inertial system, then we know
from (I: Sec. 3-5) that the other set is also an inertial system. We know
too that these two sets of axes satisfy the classical (or Galilean) principle
of relativity in that they are both equally good for use in mechanics since
the transformation (1-1) ensures that the laws of motion have the same
form in all inertial systems. We now want to see whether the fields at a
given point appear the same or differently to the two observers.

[Fig. 1-1. The coordinate systems S and S’, in relative motion along their common x and x’ axes.]

Let us assume that the observer at rest in system S finds that at a given 
point P there is no electric field but that there is a magnetic induction B. 
In principle he would learn this by placing a stationary electric charge q
at the point P and then noting that the charge experienced no acceleration.
On the other hand, if he moves the charge with velocity v, he finds that it
is acted on by the force qv × B.

Now let the observer associated with and at rest in S’, which has a 
velocity v with respect to S, perform a similar experiment by placing at 
the point P a charge q which is at rest. Since q is at rest with respect to
the axes S’, the charge is therefore moving with velocity v with respect to
the axes S. The observer in S would see this charge accelerated by the
force qv × B, and the observer in S’ must also observe exactly the same
acceleration according to (I: 3-15) since both systems are inertial systems. 

Since the S’ observer finds that the charge which is at rest as far as he 
is concerned is accelerated, he will conclude that the corresponding force 
must be written qE’, according to (I: 19-34). Thus he decides that there
is an electric field at P, and, if we equate the accelerations, we find that
this field is given by

$$ \mathbf{E}' = \mathbf{v} \times \mathbf{B} \tag{1-2} $$


Thus what appeared to one observer as a magnetic field would be inter- 
preted as an electric field by the other observer. 
The field given by (1-2) can be expected to be in addition to any electric
field E already present in S. If we assume that we can simply add (1-2) to 
E, we can say that the fields as seen by the two observers are connected 
by the transformation equation 


$$ \mathbf{E}' = \mathbf{E} + \mathbf{v} \times \mathbf{B} \tag{1-3} $$


In this equation E’ is the electric field with reference to axes S’, while E 
and B are the electric field and magnetic induction with reference to the
axes S. It is to be emphasized that these fields are measured at the same 
point and at the same time; the difference is entirely due to the different 
systems of axes to which the observers’ descriptions are referred. We 
shall find later that (1-3) is not quite exact but is a very good approximation
when |v| ≪ c.

The part of the electric field given by (1-2) is called a motional electric 
field. One can often use (1-3) to express electromagnetic properties of a 
moving medium in terms of the fields referred to a stationary coordinate 
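system. For example, if the moving medium can be described by a
conductivity σ and an electric susceptibility χ_e, the current density and
polarization in the medium will be given by

$$ \mathbf{J} = \sigma \mathbf{E}' = \sigma(\mathbf{E} + \mathbf{v} \times \mathbf{B}) \tag{1-4} $$

$$ \mathbf{P} = \epsilon_0 \chi_e \mathbf{E}' = \epsilon_0 \chi_e(\mathbf{E} + \mathbf{v} \times \mathbf{B}) \tag{1-5} $$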


As another example of the application of (1-2), we can consider the 
problem of finding the emf induced when a conductor is moving through 
a magnetic induction where, as in (I: 19-65), the emf is the line integral of 
E around the complete circuit. We shall consider the simple circuit 
shown in Fig. 1-2. Two straight wires w1 and w2 are parallel to the x axis
in the xz plane and are a distance l apart. They are connected to each
other at one end through a meter A which will measure the current in the
circuit. Another wire W rests on w1 and w2 and is parallel to the z axis.


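[Fig. 1-2. The circuit formed by the parallel wires w1 and w2, the meter A, and the sliding wire W in the uniform induction B.]
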
This whole circuit is in a uniform induction B which is parallel to the y
axis. If W is moved with constant speed v in the direction of increasing
x, a current is observed in the circuit. When the current is measured, the 
emf can be found as the product of the current and the resistance. We 
want to calculate this emf by using the transformation equations for the 
fields. 

There is no electric field along the wires w1 and w2 and the connection
through A; hence they make no contribution to the emf. From the point
of view of an observer moving with the wire W there is an electric field
E_z’ = vB, according to (1-2), since v and B are perpendicular. Therefore,
from (I: 19-65), the total emf in this circuit is

$$ \oint \mathbf{E}' \cdot d\mathbf{s} = E_z' l = Bvl \tag{1-6} $$


This result agrees with experiment and is also compatible with the integral 
form of Faraday’s law of induction (I: 19-66) for this case in which the 
flux is changing only because the area is changing. 
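
As a rough numerical illustration of (1-2) and (1-6), the following Python sketch evaluates the motional field v × B for the geometry described above and compares the resulting emf with the rate of change of flux through the circuit. The sample values of B, v, and the separation of the two parallel wires, as well as the function names, are assumptions chosen only for this example and do not come from the text.

```python
# Motional emf for the sliding-wire circuit (illustrative values only).

def cross(a, b):
    """Cross product of two 3-vectors given as (x, y, z) tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def motional_field(v, B):
    """Motional electric field E' = v x B of (1-2), in volts/meter."""
    return cross(v, B)

def emf_from_field(v, B, wire):
    """Line integral of E' along the straight moving wire (vector `wire`)."""
    E = motional_field(v, B)
    return sum(E_i * w_i for E_i, w_i in zip(E, wire))

def emf_from_flux(B_y, speed, separation):
    """Magnitude of the rate of change of flux, B * d(area)/dt."""
    return B_y * speed * separation

# Assumed sample values: B = 0.5 T along y, v = 2 m/s along x,
# and a moving wire of length 0.1 m lying along z.
v = (2.0, 0.0, 0.0)
B = (0.0, 0.5, 0.0)
wire = (0.0, 0.0, 0.1)

emf_field = emf_from_field(v, B, wire)
emf_flux = emf_from_flux(0.5, 2.0, 0.1)
print(emf_field, emf_flux)
```

Both routes give the same number, which is the agreement between the field transformation and the integral form of Faraday's law noted above for this special case.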


1-2 Maxwell’s equations and moving coordinate systems 


We have just seen that the electric field at a point is not an absolute 
property of the point but depends on the system of axes to which the 
observations are referred. The question whether magnetic fields have a 
similar dependence cannot be answered by the use of simple arguments 
like those of the last section, since the law of force, F = q(E + v × B),
contains no term that indicates a force acting on a current element which 
is moving in an electric field. Nevertheless, later we shall find by other 
means that there is for B a transformation equation of a type similar to 
(1-3). 

Since the expressions for the fields change according to the coordinate 
system one is using, one naturally wonders about the possible effect of a 
transformation between coordinate systems on the basic equations 
describing the fields, that is, on Maxwell’s equations. Rather than in- 
vestigate this question for the whole set of Maxwell’s equations, it is 
sufficient for our purposes to consider only one of the consequences of 
Maxwell’s equations—the existence of electromagnetic waves. In free 
space, according to (I: 30-1), the fields satisfy the wave equation 


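$$ \frac{\partial^2 U}{\partial x^2} + \frac{\partial^2 U}{\partial y^2} + \frac{\partial^2 U}{\partial z^2} = \frac{1}{c^2}\frac{\partial^2 U}{\partial t^2} \tag{1-7} $$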
where U can be any of the components of the electric or magnetic fields.
As we have seen, (1-7) has solutions in the form of plane waves whose 
speed of propagation is c. We shall now show that the form of this 
equation is not preserved under the Galilean transformation (1-1); in 
order to do this we shall want to express (1-7) in terms of the primed 
variables. 

For simplicity, let us confine ourselves to the one-dimensional form of 
the wave equation: 


$$ \frac{\partial^2 U}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 U}{\partial t^2} \tag{1-8} $$


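Using (1-1), we see that, if we now regard U as a function of x’ and t’,
then

$$ \frac{\partial U}{\partial x} = \frac{\partial U}{\partial x'}\frac{\partial x'}{\partial x} + \frac{\partial U}{\partial t'}\frac{\partial t'}{\partial x} = \frac{\partial U}{\partial x'} $$

so that

$$ \frac{\partial^2 U}{\partial x^2} = \frac{\partial^2 U}{\partial x'^2} \tag{1-9} $$

Similarly,

$$ \frac{\partial U}{\partial t} = \frac{\partial U}{\partial x'}\frac{\partial x'}{\partial t} + \frac{\partial U}{\partial t'}\frac{\partial t'}{\partial t} = -v\frac{\partial U}{\partial x'} + \frac{\partial U}{\partial t'} $$

$$ \frac{\partial^2 U}{\partial t^2} = v^2\frac{\partial^2 U}{\partial x'^2} - 2v\frac{\partial^2 U}{\partial x'\,\partial t'} + \frac{\partial^2 U}{\partial t'^2} \tag{1-10} $$
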
When (1-9) and (1-10) are substituted into (1-8), we find that the wave 
equation becomes 


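$$ (c^2 - v^2)\frac{\partial^2 U}{\partial x'^2} = \frac{\partial^2 U}{\partial t'^2} - 2v\frac{\partial^2 U}{\partial x'\,\partial t'} \tag{1-11} $$
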
which is certainly different from the equation (1-8) for the unprimed axes. 
Let us try to find a plane wave solution of (1-11) in the form 


$$ U = U_0 e^{ik(x' - Vt')} \tag{1-12} $$


where U_0 is a constant amplitude and k is the wave number. Substituting (1-12) into (1-11), we find
that we must have V = ±c − v so that the magnitude of the phase
velocity of the wave is given by

$$ |V| = c \pm v \tag{1-13} $$

and is no longer simply c as it was in the unprimed coordinate system 
according to (1-8). 
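
The condition on V can be checked numerically. The sketch below assumes a trial plane wave of the form exp[ik(x’ − Vt’)], a rounded value of c, and arbitrary sample values of k, x’, and t’; none of these numbers come from the text. It evaluates the difference between the two sides of the transformed wave equation (1-11) for several candidate phase velocities.

```python
import cmath

C = 2.998e8  # speed of light in the unprimed system, m/s (rounded)

def wave_equation_residual(V, v, k=1.0, x=0.3, t=0.7e-9):
    """Left side minus right side of (1-11) for U = exp(i k (x' - V t')).

    For this trial wave each d/dx' contributes a factor i k and each
    d/dt' a factor -i k V, so the derivatives are exact.
    """
    U = cmath.exp(1j * k * (x - V * t))
    d2U_dx2 = (1j * k) ** 2 * U
    d2U_dt2 = (-1j * k * V) ** 2 * U
    d2U_dxdt = (1j * k) * (-1j * k * V) * U
    return (C**2 - v**2) * d2U_dx2 - (d2U_dt2 - 2.0 * v * d2U_dxdt)

v = 3.0e4  # assumed relative speed of the two systems, m/s

for V in (C - v, -C - v, C):
    # Scale by c^2 so that "zero" means zero to within rounding error.
    print(V, abs(wave_equation_residual(V, v)) / C**2)
```

The scaled residual vanishes to rounding error for V = c − v and V = −c − v, but not for V = c, in line with (1-13).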

Thus we see that, if (1-1) and (1-8) are simultaneously valid, electro- 
magnetic effects would not be the same if they were observed from different 
reference systems moving with a constant velocity relative to one another,
and, in particular, the speed of propagation of a plane wave in a vacuum 
would not retain its value c. In other words, since (1-7) is a consequence 
of Maxwell’s equations, there would exist only one frame of reference in 
which Maxwell’s equations have the form in which we have been writing 
them and in which electromagnetic waves have the speed c. This preferred 
reference frame was generally identified with the primary inertial system 
of mechanics; when this reference system was extended by assuming it to 
possess electromagnetic properties so that it could also serve as a medium 
for propagating waves, the augmented system was called the ether. 

On the other hand, we have seen that the concept of a special reference 
system is foreign to mechanics because of the existence of the Galilean 
principle of relativity, that is, the equivalence of reference systems moving 
with constant relative velocity. Since this concept of equivalence has a 
certain simplicity and attractiveness, we are led to consider the possibility 
that there may be a common principle of relativity (equivalence of co- 
ordinate systems) for both mechanics and electromagnetism. If this were 
the case, the Galilean principle of relativity might not actually be the 
correct one, even though it does hold for mechanics in the Newtonian 
formulation which we have been using. 

The proper choice among these possibilities can only be made on the 
basis of experimental results. Historically, one method of procedure was 
to assume that Maxwell’s equations were valid in the primary inertial 
system and were changed according to the scheme illustrated by (1-11) 
on going to another inertial system—the coordinate transformation being 
the classical one of (1-1). Thus one could expect noticeable effects due to 
the motion of the laboratory system with respect to the ether, that is, the 
primary inertial system. We shall illustrate this procedure by discussing 
two famous experiments which were performed in order to look for these 
electromagnetic effects. 


1-3 The Trouton-Noble experiment 


In this experiment a charged parallel plate condenser was suspended so 
that it could turn freely about an axis parallel to the plates. It was 
expected that the effect of the translational motion of the earth with 
respect to the ether would be a torque tending to align the condenser 
plates parallel to the direction of motion. In order to derive this expected 
effect, it will be convenient to simplify the problem by regarding the charges 
on the plates as concentrated into two point charges whose separation is
the same as that of the plates, as illustrated in Fig. 1-3. 
[Fig. 1-3. The point charges +q and −q representing the charged condenser plates.]


The motion of the charge +q with the velocity v corresponds to the
current element i ds = qv by (I: 19-32) and produces a magnetic induction
B at the position of the negative charge. According to (I: 19-31), we have

$$ \mathbf{B} = \frac{\mu_0 q\,\mathbf{v} \times \mathbf{r}}{4\pi r^3} $$

so that the force on the negative charge −q is

$$ \mathbf{f} = -q\mathbf{v} \times \mathbf{B} = -\frac{\mu_0 q^2\,[\mathbf{v} \times (\mathbf{v} \times \mathbf{r})]}{4\pi r^3} \tag{1-14} $$

since the charge −q also has the velocity v. The magnitude of this force
is

$$ f = \frac{\mu_0 q^2 v^2 \sin\theta}{4\pi r^2} \tag{1-15} $$



Similarly, there will be an equal and opposite force acting on the positive 
charge as a result of the motion of the negative charge. These two forces 
together produce a torque parallel to the vertical axis of suspension which 
tends to rotate the charges until the line connecting them (r) is perpen- 
dicular to v. The magnitude of the torque is 


$$ N = fr\cos\theta = \frac{\mu_0 q^2 v^2 \sin 2\theta}{8\pi r} = \frac{U_e}{2}\left(\frac{v}{c}\right)^2 \sin 2\theta \tag{1-16} $$

where U_e = q²/4πε_0 r is the electrostatic energy of the system, and where
we have used μ_0 ε_0 = c⁻². [The calculated result which takes into account
the charge distribution of the parallel plate condenser is just twice the
value given in (1-16).]

Equation (1-16) shows us that this torque is of the order of magnitude
of ½U_e(v/c)². Taking v as about equal to the speed of the earth in its
orbit, which is about 3 × 10⁴ meters/second, one finds that the expected
torque would be large enough to be observable. However, the most
torque would be large enough to be observable. However, the most 
careful experiments failed to find any indication of this torque. Thus 
the result of the Trouton-Noble experiment is incompatible with the 
concept of a stationary ether as a preferred frame for electromagnetism. 
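
The size of the expected effect can be estimated with a short Python sketch of (1-15) and (1-16). The charge, separation, orientation, and rounded constants used here are illustrative assumptions, not values from the original experiment; the sketch only confirms that the torque built from the force agrees with the closed form in terms of the electrostatic energy.

```python
import math

C = 2.998e8                 # speed of light, m/s (rounded)
MU0 = 4.0e-7 * math.pi      # permeability of free space, H/m (classical value)
EPS0 = 1.0 / (MU0 * C**2)   # from mu0 * eps0 = 1/c^2

def electrostatic_energy(q, r):
    """Magnitude of U_e = q^2 / (4 pi eps0 r) for the two point charges."""
    return q**2 / (4.0 * math.pi * EPS0 * r)

def force_magnitude(q, r, v, theta):
    """Magnetic force of (1-15); theta is the angle between v and r."""
    return MU0 * q**2 * v**2 * math.sin(theta) / (4.0 * math.pi * r**2)

def torque(q, r, v, theta):
    """Torque N = f r cos(theta), the first form in (1-16)."""
    return force_magnitude(q, r, v, theta) * r * math.cos(theta)

def torque_over_energy(v, theta):
    """Last form of (1-16) divided by U_e."""
    return 0.5 * (v / C) ** 2 * math.sin(2.0 * theta)

# Assumed sample values: 10 nC charges 1 cm apart, oriented at 45 degrees.
q, r = 1.0e-8, 1.0e-2
v_earth = 3.0e4             # orbital speed of the earth, m/s
theta = math.pi / 4.0

ratio_direct = torque(q, r, v_earth, theta) / electrostatic_energy(q, r)
ratio_formula = torque_over_energy(v_earth, theta)
print(ratio_direct, ratio_formula)
```

With these assumptions the torque is about 5 × 10⁻⁹ of the electrostatic energy, which shows why the effect is second order in v/c.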


1-4 The Michelson-Morley experiment 


This experiment was based on the result (1-13) which essentially showed 
that the phase velocity of light in a given system was the sum of the 
velocity of the light with respect to the ether plus the velocity of the 
moving coordinate system with respect to the ether. In other words, 
(1-13) says that the phase velocity of the wave will depend on the motion 
of the medium. Since the orbital speed of the earth is so small compared 
to c, it is not feasible to make a direct measurement of the light velocity in 
various directions with respect to the earth’s surface in order to check on 
(1-13). It is possible, however, to compare these directions and look for 
this small effect by using the light itself in a particular manner; this idea 
had been suggested by Maxwell, and the experiment was first performed 
by Michelson and Morley in 1887. 

The experimental arrangement is illustrated in Fig. 1-4. Light from the
source L is incident upon the partially silvered glass plate M and divided
into two beams which travel normal to each other. Each beam covers the
distance l_0 to the mirrors M1 and M2 and is reflected back over its original
path. Part of the light from each beam passes along the fourth arm
toward a detector T. Thus the beams have been recombined, and any
phase difference between them resulting from their journeys back and
forth will produce interference effects observable as a variable amplitude
in the superimposed beam. In order to calculate the phase difference,
we can find the times taken by the light to travel along the arms.

[Fig. 1-4. Arrangement of the Michelson-Morley experiment.]

Assuming the arm 1 to be the direction of motion of the apparatus,
then, according to (1-13), we find that the speed of the light is c − v on
the initial path and c + v on the return path. Thus the time taken to pass
to and fro along this arm is

$$ t_1 = \frac{l_0}{c - v} + \frac{l_0}{c + v} = \frac{2l_0}{c}\left(1 - \frac{v^2}{c^2}\right)^{-1} \simeq \frac{2l_0}{c}\left(1 + \frac{v^2}{c^2}\right) \tag{1-17} $$

In the perpendicular direction along arm 2, the resultant relative light
velocity is √(c² − v²), and the corresponding time for the double journey is

$$ t_2 = \frac{2l_0}{\sqrt{c^2 - v^2}} \simeq \frac{2l_0}{c}\left(1 + \frac{v^2}{2c^2}\right) \tag{1-18} $$



Thus the two times are not equal, the difference being obtained from
(1-17) and (1-18) as

$$ \Delta t = t_1 - t_2 = \frac{l_0}{c}\left(\frac{v}{c}\right)^2 \tag{1-19} $$

Taking l_0 = 30 meters and v ≃ 3 × 10⁴ meters/second as before, we find
from (1-19) that Δt = 10⁻¹⁵ second. This corresponds to a phase difference

$$ 2\pi\nu\,\Delta t = \frac{2\pi c\,\Delta t}{\lambda} = 0.6(2\pi) \simeq 3.8 \text{ radians} $$


for visible light with wavelength of 0.5 micron. This is a sizable relative 
phase shift and could lead to detectable changes in the interference 
pattern as the apparatus is rotated to interchange the two arms. However, 
no such effects were observed, either in the original Michelson-Morley 
experiment or in subsequent experiments of a similar nature. Thus these 
optical experiments contradict the result (1-13) which followed from 
combining the concepts of a preferred frame for electromagnetism with 
the Galilean relativity principle for mechanics. 
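
The arithmetic of this estimate can be reproduced with a brief Python sketch. It uses the arm length, speed, and wavelength quoted above together with the rounded value c = 3 × 10⁸ meters/second, which is an assumption of this example; the function names are illustrative only. The exact travel times on the ether hypothesis are compared with the approximation (1-19).

```python
import math

C = 3.0e8  # speed of light, m/s (rounded for this estimate)

def time_parallel(l0, v):
    """Round-trip time along the arm parallel to the motion."""
    return l0 / (C - v) + l0 / (C + v)

def time_perpendicular(l0, v):
    """Round-trip time along the arm at right angles to the motion."""
    return 2.0 * l0 / math.sqrt(C**2 - v**2)

def delta_t_approx(l0, v):
    """Second-order approximation (1-19) to the time difference."""
    return (l0 / C) * (v / C) ** 2

def phase_difference(delta_t, wavelength):
    """Phase difference 2 pi nu delta_t, with nu = c / wavelength."""
    return 2.0 * math.pi * C * delta_t / wavelength

l0, v, wavelength = 30.0, 3.0e4, 0.5e-6   # meters, m/s, meters

dt_exact = time_parallel(l0, v) - time_perpendicular(l0, v)
dt_approx = delta_t_approx(l0, v)
phase = phase_difference(dt_approx, wavelength)
print(dt_exact, dt_approx, phase)
```

The exact and approximate time differences agree to well within the accuracy of the estimate, and the phase comes out as 0.6(2π), or roughly 3.8 radians.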

There was one famous attempt to preserve the preferred ether frame, 
however. The Lorentz-Fitzgerald contraction hypothesis proposed that, 
for all bodies, motion relative to the stationary ether frame produced a 
contraction in the dimensions parallel to the direction of motion by the
factor √(1 − v²/c²). Accordingly, the length for arm 1 should actually be
l_1 = l_0 √(1 − v²/c²), while l_2 = l_0 since arm 2 is at right angles to the
direction of motion. The time t_1 given in (1-17) then becomes

$$ t_1 = \frac{2l_1}{c}\left(1 - \frac{v^2}{c^2}\right)^{-1} = \frac{2l_0}{c}\left(1 - \frac{v^2}{c^2}\right)^{-1/2} = t_2 $$



so that At = 0 and there should be no effect, exactly as was observed. 
However, although the contraction hypothesis would explain the original 
Michelson-Morley experiment, it does not explain the subsequent Kennedy- 
Thorndike experiment which used an apparatus with arms of unequal 
length, l_10 and l_20. Taking account of the contraction, one can show that
then

$$ \Delta t = \frac{2(l_{10} - l_{20})}{c}\left(1 - \frac{v^2}{c^2}\right)^{-1/2} \tag{1-20} $$



which was not observed either. 
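
A small Python sketch makes the difference between the two cases concrete. It applies the assumed contraction to arm 1 only and computes the travel times as in (1-17) and (1-18). The arm lengths of 1.0 and 0.8 meters and the rounded value of c are arbitrary assumptions for illustration and are not the dimensions of the actual apparatus.

```python
import math

C = 3.0e8  # speed of light, m/s (rounded)

def delta_t_contracted(l10, l20, v):
    """t1 - t2 when arm 1 (rest length l10) is contracted along the motion."""
    l1 = l10 * math.sqrt(1.0 - (v / C) ** 2)
    t1 = l1 / (C - v) + l1 / (C + v)
    t2 = 2.0 * l20 / math.sqrt(C**2 - v**2)
    return t1 - t2

def delta_t_formula(l10, l20, v):
    """Closed form (1-20) for arms of unequal length."""
    return 2.0 * (l10 - l20) / C / math.sqrt(1.0 - (v / C) ** 2)

v = 3.0e4  # m/s

# Equal arms: the contraction removes the time difference.
equal_arms = delta_t_contracted(30.0, 30.0, v)

# Unequal arms: a velocity-dependent difference remains.
unequal_direct = delta_t_contracted(1.0, 0.8, v)
unequal_formula = delta_t_formula(1.0, 0.8, v)
change_with_v = delta_t_formula(1.0, 0.8, v) - delta_t_formula(1.0, 0.8, 0.0)

print(equal_arms, unequal_direct, unequal_formula, change_with_v)
```

For equal arms the result is zero to rounding error, while for unequal arms the time difference still depends on v, which is the effect the contraction hypothesis alone would predict.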

These experiments contradict the concept of a preferred reference frame 
for electromagnetism. Accordingly, we are led to assume that there is a 
common principle of relativity for mechanics and electromagnetism, but 
it cannot be that corresponding to the transformation expressed by (1-1). 
The concepts which we introduce to replace those we have used up to now 
are those supplied by the postulates of special relativity. 


Exercises 


1-1. A magnetic induction B is parallel to the axis of a cylinder of radius a
and dielectric susceptibility χ_e. If the cylinder is rotated about its axis with an
angular velocity ω, show that the resultant polarization produced is P =
ε_0 χ_e ωBr and that the charge per unit length which appears on the surface of the
cylinder is given by 2πa² ε_0 χ_e ωB.

1-2. A brass disk of radius a is mounted on an axle which is parallel to the
uniform induction B. A current I enters the disk at a contact on the circumference
and leaves the disk along the center of the axle. Show that the torque
on this system, called a Faraday disk, is ½IBa².

1-3. Derive (1-20). 
2 The basic postulates and the Lorentz transformation


In the preceding chapter we have seen that experiment contradicts the 
concept of a preferred system for electromagnetism. We have also seen 
it to be plausible that the concept of relativity (equivalence of moving 
coordinate systems) holds for electromagnetism as well as for mechanics. 
In 1905 Einstein extended these ideas even further by proposing the 
equivalent of the two following postulates. 


POSTULATE ONE. All systems of coordinates are equally suitable for 
the description of physical phenomena. 


This postulate literally refers to all systems of coordinates, although we 
shall restrict ourselves to those systems which are moving with constant 
velocity relative to one another. When we use this restriction, we are 
discussing the special theory of relativity; an unrestricted discussion 
would correspond to the general theory of relativity. 

The postulate also says that there is a common principle of relativity 
for all of physics, although we shall be discussing it only in terms of 
mechanics and electromagnetism. Another way of stating the first 
postulate is to say that no theory can contain any reference whatsoever to 
an absolute speed of translational motion of the coordinate system which 
is being used. We saw in (1-13) that Maxwell’s equations combined with 
the Galilean transformation did indicate the possibility of effects due to an 
absolute velocity, but such a prediction is not borne out by experiment. 
This could mean that Maxwell’s equations are not exact but are only 
approximations, or it could mean that the Galilean transformation is not 
correct. Einstein preferred the second possibility; instead of postulating 
that Maxwell’s equations must be covariant (unchanged in form) to the 
proper transformation of coordinates, it is sufficient (and equivalent) to 
state: 


POSTULATE TWO. The speed of light in vacuum is the same for all 
these observers and is independent of the motion of the source. 


It is this second postulate which leads to the more unfamiliar results of 
the Einstein theory of relativity. The postulate implies that Newtonian 
mechanics is not the correct form for the exact representation of mechanical
phenomena, since the transformation (1-1) representing the relativity
principle natural to mechanics is incompatible with this second postulate, 
as is illustrated by (1-13). First it will be necessary for us to find the 
transformation rules for the coordinates which will satisfy the second 
postulate; then we shall have to find such laws of mechanics as will 
satisfy the first postulate when we use these transformation laws. The 
actual applicability of these two postulates to the description of physical 
phenomena can then be appraised by seeing how well results deduced 
from them agree with experiment. 


2-1 The Lorentz transformation 


The transformation of coordinates which is quantitatively compatible 
with the second postulate is called the Lorentz transformation and is what 
we now want to find. We still consider two coordinate systems in relative 
motion. For a given point event the first observer will assign the spatial
coordinates and time $(x, y, z, t)$, and those obtained by the second observer
will be $(x', y', z', t')$. Thus our transformation equations must give the
position $(x', y', z')$ and the time $t'$ of the event if the first observer assigns
the values $(x, y, z)$ and $t$ to it. Although our derivation is somewhat
oversimplified, it will lead to the correct result and will include the princi- 
pal features of a more elaborate and exact treatment. 

Suppose that, at the instant the two origins of the coordinate systems 
coincide, a small pulse of light is produced at the origin. The second 
postulate requires that both observers see the light propagating outward 
with the same speed c in all directions. Thus, with respect to his own 
coordinate system, each observer must see the wavefront as a sphere of 
radius equal to c times the time. The equations of the wavefront are, 
therefore, 

$$x^2 + y^2 + z^2 - c^2t^2 = 0 \tag{2-1}$$

$$x'^2 + y'^2 + z'^2 - c^2t'^2 = 0 \tag{2-2}$$

and we must satisfy the identity

$$x^2 + y^2 + z^2 - c^2t^2 = x'^2 + y'^2 + z'^2 - c^2t'^2 \tag{2-3}$$


An equivalent way of saying this is that the quantity on either side of (2-3) 
must be invariant with respect to the transformation leading from one 
system to the other. 

Again, for the sake of simplicity, we take the $x$ and $x'$ axes in the
direction of the relative velocity of the systems $S$ and $S'$ as in Fig. 1-1.
First we make the fairly evident assumption that the transverse coordinates
remain unchanged, that is, $y = y'$, $z = z'$, so that (2-3) becomes


$$x^2 - c^2t^2 = x'^2 - c^2t'^2 \tag{2-4}$$


Since there is only one relative velocity $v$, the transformation formulas
must further fulfil the condition that the origin $O'$ has the coordinate
$vt$ in $S$ and that $O$ has the coordinate $-vt'$ in $S'$, since the two
origins coincided at $t = t' = 0$. Thus we must have

$$x = vt \ \text{ when } \ x' = 0, \qquad x' = -vt' \ \text{ when } \ x = 0 \tag{2-5}$$

as is also illustrated in Fig. 2-1. We shall 
try the simplest relations which will sat- 
isfy (2-5), that is, the linear equations 


$$x' = \gamma(x - vt), \qquad x = \gamma'(x' + vt') \tag{2-6}$$


Fig. 2-1 


where $\gamma$ and $\gamma'$ are constants to be determined.
We can express $t'$ in terms of the unprimed quantities by eliminating
$x'$ from (2-6); the result is that


$$t' = \gamma\left[t - \frac{x}{v}\left(1 - \frac{1}{\gamma\gamma'}\right)\right] \tag{2-7}$$


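Putting these expressions for $x'$ and $t'$ into (2-4) and rearranging, we
obtain

$$x^2\left[1 - \gamma^2 + \frac{c^2\gamma^2}{v^2}\left(1 - \frac{1}{\gamma\gamma'}\right)^2\right] + 2xt\gamma^2\left[v - \frac{c^2}{v}\left(1 - \frac{1}{\gamma\gamma'}\right)\right] + t^2\left[c^2(\gamma^2 - 1) - \gamma^2v^2\right] = 0 \tag{2-8}$$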


In order that (2-8) will always be zero for all possible values of $x$ and $t$, it
is necessary that the coefficients of $x^2$, $xt$, and $t^2$ vanish separately.
Equating the coefficient of $t^2$ to zero, we find that

$$\gamma = \frac{1}{\sqrt{1 - (v^2/c^2)}} \tag{2-9}$$

We also find from the coefficient of $xt$ and from (2-9) that $\gamma = \gamma'$. We
can now easily verify that these values of $\gamma$ and $\gamma'$ also make the coefficient
of $x^2$ in (2-8) vanish.

Combining all these results, we obtain the Lorentz transformation
formulas 


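$$x' = \gamma(x - vt), \quad y' = y, \quad z' = z, \quad t' = \gamma\left(t - \frac{v}{c^2}x\right) \tag{2-10}$$

$$x = \gamma(x' + vt'), \quad y = y', \quad z = z', \quad t = \gamma\left(t' + \frac{v}{c^2}x'\right) \tag{2-11}$$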


where y is given by (2-9), and the equations (2-11) can be obtained from 
(2-10) by solving for the unprimed quantities in terms of the primed ones. 

If we let $c \to \infty$ in (2-10), we obtain in the limit the Galilean transformation
(1-1) which we now see as a first approximation to the Lorentz
transformation for finite values of v. The results (2-10) and (2-11) also 
have the desired symmetry since, by interchanging primed and unprimed 
symbols and changing the sign of v, we go from one set to the other. 

Actually, any linear transformation of coordinates and time which 
satisfies (2-3) is called a Lorentz transformation. The one given in (2-10) 
is the particular one which applies only to the special case in which the
relative motion of the systems is along their common $x$ direction. We
shall return to the consideration of more general Lorentz transformations 
later, but for the present we shall restrict ourselves to the one given in 
(2-10). 
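
The two properties just claimed for (2-10) and (2-11) can be checked numerically. The following Python sketch is only an illustration: the function names `lorentz`, `inverse_lorentz`, and `interval` and the sample event are assumptions made for the example, not part of the text. It applies (2-10) to one event, confirms that $x^2 - c^2t^2$ keeps its value, and confirms that (2-11) is (2-10) with the sign of $v$ reversed.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v, c=C):
    """The factor defined in (2-9)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def lorentz(x, t, v, c=C):
    """(2-10): coordinates (x', t') in S' of an event (x, t) in S."""
    g = gamma(v, c)
    return g * (x - v * t), g * (t - v * x / c**2)


def inverse_lorentz(xp, tp, v, c=C):
    """(2-11): obtained from (2-10) by changing the sign of v."""
    return lorentz(xp, tp, -v, c)


def interval(x, t, c=C):
    """The quantity x^2 - c^2 t^2 appearing on each side of (2-4)."""
    return x**2 - (c * t) ** 2


# Illustrative event and relative speed (assumed values).
v = 0.6 * C
x, t = 1.0e3, 2.0e-6
xp, tp = lorentz(x, t, v)
x_back, t_back = inverse_lorentz(xp, tp, v)

print(interval(x, t), interval(xp, tp))  # the two values should agree
print(x_back, t_back)                    # should reproduce x and t
```

Working with $v$ as a fraction of $c$ keeps the square root in (2-9) real; the sketch does not guard against $|v| \geq c$.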


Exercise 


2-1. Show by direct substitution of (2-10) into (1-8) that the Lorentz transformation
preserves the form of the wave equation, that is, if $\partial^2 U/\partial x^2 = (1/c^2)\,\partial^2 U/\partial t^2$,
then $\partial^2 U/\partial x'^2 = (1/c^2)\,\partial^2 U/\partial t'^2$.


3 Some kinematic consequences of the Lorentz transformation


Now that we have obtained the Lorentz transformation equations, we 
want to see what sort of physical results are implied by them. We confine 
ourselves in this chapter to effects which are basically kinematical, that is, 
which involve only the geometrical and temporal descriptions of events 
given by our two observers. 
We shall let

$$\beta = \frac{v}{c} \tag{3-1}$$

so that (2-10) becomes

$$x' = \gamma(x - \beta ct), \quad y' = y, \quad z' = z, \quad t' = \gamma\left(t - \frac{\beta x}{c}\right) \tag{3-2}$$

where

$$\gamma = \frac{1}{\sqrt{1 - \beta^2}} \tag{3-3}$$


3-1 Relativity of simultaneity 


Suppose that two events occur at the points $x_1$ and $x_2$ in the $S$ system
and are simultaneous so that $t_1 = t_2$. The times at which these events are
observed in $S'$ as obtained from (3-2) are

$$t_1' = \gamma\left(t_1 - \frac{\beta x_1}{c}\right), \qquad t_2' = \gamma\left(t_2 - \frac{\beta x_2}{c}\right) \tag{3-4}$$

Therefore the time interval between these events is

$$\Delta t' = t_2' - t_1' = -\frac{\gamma\beta}{c}(x_2 - x_1) \neq 0 \tag{3-5}$$

This result shows that two events which occur at different points $x_1$ and
$x_2$, and which are simultaneous for an observer at rest in $S$, will no longer
appear to be simultaneous to an observer at rest in S’, and who is therefore 
moving relative to S. In other words, simultaneity is not an absolute 
property of a pair of events but is also a function of the state of motion of 
the observer. 


3-2 The Einstein time dilation 
Suppose that we have a clock located at the point $x_1$ and that it is
emitting signals of some sort at regular intervals $\Delta t$, where

$$\Delta t = t_2 - t_1 = t_3 - t_2, \ \text{etc.} \tag{3-6}$$

The corresponding times in the primed system are given by (3-4), except
that now $x_1 = x_2$ since the clock is fixed in the unprimed system. Therefore,
when seen from the system which is moving relative to the clock, we find
from (3-4) that these signals will be separated by the time intervals $\Delta t'$
given by

$$\Delta t' = \gamma\,\Delta t > \Delta t \tag{3-7}$$

since $\gamma > 1$.
Thus the time interval appears to be longer to the moving observer than 
it does in the system in which the clock is at rest. 
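
A short numerical illustration of (3-7) may help. The Python sketch below assumes units in which $c = 1$, a clock fixed at an arbitrary position in $S$, and a relative speed $\beta = 0.8$; these values and the helper name `t_prime` are choices made for the example only. It transforms the emission times with (3-2) and compares the spacing seen in $S'$ with $\gamma\,\Delta t$.

```python
import math


def gamma(beta):
    """The factor defined in (3-3)."""
    return 1.0 / math.sqrt(1.0 - beta**2)


def t_prime(t, x, beta, c=1.0):
    """Time assigned in S' to the event (x, t) of S, from (3-2)."""
    return gamma(beta) * (t - beta * x / c)


beta = 0.8                      # assumed relative speed v/c
x_clock = 5.0                   # the clock stays at this point of S
ticks = [0.0, 1.0, 2.0, 3.0]    # emission times in S, so Delta t = 1

ticks_prime = [t_prime(t, x_clock, beta) for t in ticks]
intervals_prime = [b - a for a, b in zip(ticks_prime, ticks_prime[1:])]

print(intervals_prime)          # each entry should be close to gamma * 1
print(gamma(beta))
```

Because every tick has the same $x$, the term $\beta x/c$ cancels in the differences, which is why the position of the clock does not affect the result.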


3-3 The Lorentz contraction 


In principle, we measure a length by placing a measuring rod along the 
distance to be measured and then finding the difference between the scale 
marks which simultaneously coincide with the ends of the length of interest. 
This detailed specification, which seems trivial when the measuring rod 
and the length are relatively at rest, is essential when they are moving with 
respect to each other. 

We let $L = x_2 - x_1$ be the distance between two points as found with a
scale which is at rest with respect to them. The length as found by the
moving observer who assigns coordinates $x_1'$ and $x_2'$ to the ends is found
from (3-2) to be

$$L' = x_2' - x_1' = \gamma[x_2 - x_1 - \beta c(t_2 - t_1)] \tag{3-8}$$

However, the points $x_1'$ and $x_2'$ must have been determined simultaneously
in the $S'$ frame, so that $t_1' = t_2'$; this means that the time interval $(t_2 - t_1)$
appearing in (3-8) must be given by

$$t_2 - t_1 = \frac{\beta}{c}(x_2 - x_1) \tag{3-9}$$

according to (3-4). When (3-9) is substituted into (3-8) and (3-3) is used,
we find that

$$L' = \gamma(1 - \beta^2)L = \frac{L}{\gamma} = L\sqrt{1 - \beta^2} \tag{3-10}$$


Thus, as found by the moving observer, the length in the direction of 
motion will be contracted by the factor 1/y. 
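
The role of simultaneity in this argument can be made concrete with a small Python sketch. It assumes $c = 1$, a rod of rest length 10 lying along the $x$ axis of $S$, and $\beta = 0.6$; all of these numbers and the name `lorentz` are illustrative. The two end-marking events are chosen to satisfy (3-9), and the sketch then confirms that they are simultaneous in $S'$ and separated there by $L/\gamma$.

```python
import math


def lorentz(x, t, beta, c=1.0):
    """(3-2): returns (x', t') for the event (x, t)."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return g * (x - beta * c * t), g * (t - beta * x / c)


beta, c = 0.6, 1.0
x1, x2 = 0.0, 10.0          # ends of the rod, at rest in S
L = x2 - x1

# Times in S at which the moving observer marks the two ends; (3-9)
# fixes their difference so that the marks are simultaneous in S'.
t1 = 0.0
t2 = t1 + beta * (x2 - x1) / c

x1p, t1p = lorentz(x1, t1, beta, c)
x2p, t2p = lorentz(x2, t2, beta, c)
L_prime = x2p - x1p

print(t1p, t2p)                               # should be equal
print(L_prime, L * math.sqrt(1.0 - beta**2))  # should agree, as in (3-10)
```

If the two ends were instead marked at the same time $t$ in $S$, the same code would give $x_2' - x_1' = \gamma L$, which is not a length measurement in $S'$ because the marks would not be simultaneous there.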

Although we derived these quantitative results (3-5), (3-7), and (3-10) 
by means of the Lorentz transformation, it should be emphasized that it 
can be shown that these effects are qualitative consequences of the two 
postulates only, and thus the existence of these effects does not depend on 
the specific formulas given in (3-2) and used above. 


3-4 Einstein's velocity addition formulas


In Newtonian mechanics, which depends on the Galilean transformation, 
relative velocities are found by simple addition of components as is 
shown explicitly in (I: 3-14). In a mechanics which depends on the 
Lorentz transformation, the combining of velocities is somewhat more
complicated.

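Suppose that a point has the velocity $\mathbf{u}'$ as observed in $S'$, where

$$\mathbf{u}' = u_x'\,\hat{\mathbf{i}} + u_y'\,\hat{\mathbf{j}} + u_z'\,\hat{\mathbf{k}} \tag{3-11}$$

and

$$u_x' = \frac{dx'}{dt'}, \qquad u_y' = \frac{dy'}{dt'}, \qquad u_z' = \frac{dz'}{dt'} \tag{3-12}$$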


We now want to find the components $u_x$, $u_y$, $u_z$ of the velocity of this same
point when its motion is observed with respect to the system $S$.
Differentiating (2-11) and (3-2), we find that

$$u_x = \frac{dx}{dt} = \frac{dx}{dt'}\frac{dt'}{dt} = \gamma(u_x' + v)\frac{dt'}{dt}$$

$$\frac{dt'}{dt} = \gamma\left(1 - \frac{\beta}{c}u_x\right) \tag{3-13}$$

so that, when these are combined and $dt'/dt$ is eliminated, we obtain

$$u_x = \gamma^2(u_x' + v)\left(1 - \frac{\beta}{c}u_x\right)$$

which can be solved for $u_x$, yielding

$$u_x = \frac{u_x' + v}{1 + (vu_x'/c^2)} \tag{3-14}$$


We can now express the right side of (3-13) completely in terms of primed
quantities by substituting (3-14); the result is that

$$\frac{dt'}{dt} = \frac{1}{\gamma[1 + (vu_x'/c^2)]} \tag{3-15}$$

Similarly we find from (2-11) and (3-15) that

$$u_y = \frac{dy}{dt} = \frac{dy'}{dt'}\frac{dt'}{dt} = \frac{u_y'}{\gamma[1 + (vu_x'/c^2)]} \tag{3-16}$$

$$u_z = \frac{u_z'}{\gamma[1 + (vu_x'/c^2)]} \tag{3-17}$$

which are the remaining relations needed to convert velocity components
observed in one system to those observed in another.

These equations have the interesting property that the sum of two 
velocities can never exceed c. Let us consider an extreme case as an 
illustration: Suppose that the system $S'$ has relative speed $v = c$, while the
velocity of the point with respect to $S'$ is also $u_x' = c$. According to the
Galilean transformation, the resultant speed observed in $S$ would be
$u_x = u_x' + v = c + c = 2c$. The Einstein formula (3-14) gives, instead,

$$u_x = \frac{c + c}{1 + (c^2/c^2)} = c$$

as was asserted.
We also note that, if $\beta = v/c \ll 1$, then (3-14), (3-16), and (3-17)
become approximately

$$u_x \approx u_x' + v, \qquad u_y \approx u_y', \qquad u_z \approx u_z' \tag{3-18}$$


so that the simple addition of components which follows from the Galilean 
transformation is an approximation which holds quite well for low relative 
velocities of the coordinate systems. 
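
The formulas (3-14), (3-16), and (3-17) translate directly into a few lines of Python. The sketch below is illustrative only: the function name `add_velocities`, the choice of units with $c = 1$, and the sample speeds are assumptions. It uses speeds slightly below $c$ rather than the limiting case $v = c$ of the text, because $\gamma$ in (3-16) and (3-17) is not finite at $v = c$.

```python
import math


def add_velocities(ux_p, uy_p, uz_p, v, c=1.0):
    """(3-14), (3-16), (3-17): velocity in S from the velocity in S'."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    denom = 1.0 + v * ux_p / c**2
    return (ux_p + v) / denom, uy_p / (g * denom), uz_p / (g * denom)


# Two large speeds along x: the Galilean sum would be 1.8c.
fast = add_velocities(0.9, 0.0, 0.0, 0.9)

# Small speeds: the result should be close to simple addition, as in (3-18).
slow = add_velocities(3.0e-5, 1.0e-5, 0.0, 1.0e-4)

# Light moving along y' in S': its speed in S should still be c.
light = add_velocities(0.0, 1.0, 0.0, 0.5)

print(fast)
print(slow)
print(math.hypot(light[0], light[1]))
```

The last case shows that the transverse components must change by the factor $1/\gamma$ for the speed of light to remain $c$ in both systems.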


Application. Index of refraction of moving media 


The classical experiment on the propagation of light in moving media 
was performed by Fizeau, who measured the small difference in the indices 
of refraction of stationary and flowing water by an interference method. 
It is clear that the flow velocity of the water cannot simply be added to the 
phase velocity of the light, as this would require the effect to be independ- 
ent of the density of the medium and thus result in a sharp discontinuity 
between the properties of an extremely tenuous flowing medium and 
those of a vacuum for which the effect cannot occur. Such a discontinuity 
is never observed, and, in fact, the actual result of the experiment can be
expressed in terms of the wave velocity $u$ in the flowing medium as

$$u = u_0 + v\left(1 - \frac{1}{n^2}\right) \tag{3-19}$$

where $v$ is the flow velocity of the medium of index of refraction $n$, and
$u_0 = c/n$ is the wave velocity in the stationary medium. The factor
multiplying $v$ is called the Fresnel coefficient; it approaches zero as $n \to 1$,
as we would expect. Although (3-19) can be derived from a theory based
on the ether concept, it is most easily obtained from the relativistic formulas 
for combining velocities. 
If we assume that $v/c \ll 1$ and neglect second order terms in $v/c$, we find
from (3-14) that the phase velocity $u$ found by an observer at rest in the
stationary system is given by

$$u = \frac{u_0 + v}{1 + (vu_0/c^2)} \approx (u_0 + v)\left(1 - \frac{vu_0}{c^2}\right) \approx u_0 + v - \frac{vu_0^2}{c^2} = u_0 + v\left(1 - \frac{u_0^2}{c^2}\right) = u_0 + v\left(1 - \frac{1}{n^2}\right)$$

which is exactly the experimental result in (3-19). 
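
To see how good the first-order approximation is, one can compare the exact expression from (3-14) with the Fresnel form (3-19) numerically. The Python sketch below assumes water with $n = 1.33$ and a flow speed of 7 m/s; both numbers are illustrative values chosen for the example, not data from the text, and the function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s


def exact_speed(n, v, c=C):
    """Wave speed in S from the exact addition formula (3-14), with u0 = c/n."""
    u0 = c / n
    return (u0 + v) / (1.0 + v * u0 / c**2)


def fresnel_speed(n, v, c=C):
    """Wave speed from the first-order result (3-19)."""
    u0 = c / n
    return u0 + v * (1.0 - 1.0 / n**2)


n, v = 1.33, 7.0            # assumed index of refraction and flow speed (m/s)
u0 = C / n

drag_exact = exact_speed(n, v) - u0
drag_fresnel = fresnel_speed(n, v) - u0

print(drag_exact, drag_fresnel)   # both are a fraction (1 - 1/n^2) of v
print(fresnel_speed(1.0, v) - C)  # no drag when n = 1
```

The two results differ only by terms of second order in $v/c$, which for laboratory flow speeds are far below what an interference measurement could detect.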


3-5 The Doppler effect and aberration 


In this section we want to investigate how the frequencies and directions 
of propagation of a wave will appear to our two observers. Suppose that a 
light source Q’, which is at rest in S’, emits a spherical wave. According to 
(I: 32-5), we can represent this wave by 

$$U' = \frac{U_0'}{r'}\,e^{i(k'r' - \omega't')} \tag{3-20}$$

where $U_0'$ is an amplitude factor and the frequency $\nu'$ is found from

$$k' = \frac{\omega'}{c} = \frac{2\pi\nu'}{c} \tag{3-21}$$

Suppose that there is an observer $P$ at rest in $S$ and located in the $xy$
plane at the point $(x, y)$, while the coordinates are $x'$ and $y'$ with respect to
$S'$. As illustrated in Fig. 3-1, the direction of propagation from $Q'$ to $P$
makes the angle $\theta'$ with the $x'$ axis so that

$$r' = x'\cos\theta' + y'\sin\theta' \tag{3-22}$$


The observer in $S$ will interpret the wave as a spherical wave originating
from a point source $Q$ in his system so that he must write the equation for
the wave in the form

$$U = \frac{U_0}{r}\,e^{i(kr - \omega t)} \tag{3-23}$$

where

$$k = \frac{\omega}{c} = \frac{2\pi\nu}{c} \tag{3-24}$$

$$r = x\cos\theta + y\sin\theta \tag{3-25}$$

Now the numerical value of the phase of the wave must be the same and 
independent of whatever system is used to express it; consequently, the 
values of the exponents in (3-20) and (3-23) must always be equal. In 
addition, the coordinates and the times of the two systems must be 
connected by the Lorentz transformation (3-2). Equating these exponents, 
substituting from (3-21), (3-22), (3-24), (3-25), and (3-2), we find that we 
must have 


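$$\nu\left(\frac{x\cos\theta + y\sin\theta}{c} - t\right) = \nu'\left[\frac{\gamma(x - \beta ct)\cos\theta' + y\sin\theta'}{c} - \gamma\left(t - \frac{\beta x}{c}\right)\right] \tag{3-26}$$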


This equality must hold for all possible values of $x$, $y$, and $t$; this is
possible only if the coefficients of these quantities are separately equal.
Equating the coefficients of $t$ on both sides of (3-26), we find that

$$\nu = \nu'\gamma(1 + \beta\cos\theta') \tag{3-27}$$

which is the exact relativistic Doppler effect formula and relates the two
frequencies. We note that, if $\beta \ll 1$, then $\gamma \approx 1$, and (3-27) becomes the
classical formula

$$\nu \approx \nu'(1 + \beta\cos\theta') \tag{3-28}$$

Equating the coefficients of $x$ on both sides of (3-26), we find that

$$\nu\cos\theta = \nu'\gamma(\cos\theta' + \beta) \tag{3-29}$$

and, if we use (3-27) to eliminate the frequencies, we find that

$$\cos\theta = \frac{\cos\theta' + \beta}{1 + \beta\cos\theta'} \tag{3-30}$$

which relates the two directions of propagation and is the aberration
formula.

The relativistic Doppler formula (3-27) differs from the acoustical 
formula (3-28) in that it predicts a transverse effect. Suppose that one
observes the frequency at right angles to the motion, that is, $\theta = \pi/2$.
Then $\cos\theta = 0$ and $\cos\theta' = -\beta$, according to (3-30). When this is
substituted into (3-27) and (3-3) is used, we find that

$$\nu = \nu'\sqrt{1 - \beta^2} \tag{3-31}$$


The effect predicted by (3-31) has been observed by studying the radiation 
from an electric discharge in hydrogen; it provides some experimental 
evidence for the basic correctness of the relativity postulates. 
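
The transverse case can be reproduced numerically from (3-27) and (3-30). In the Python sketch below the function names `doppler` and `aberration` and the value $\beta = 0.5$ are assumptions made for illustration. The emission angle in $S'$ is chosen as $\cos\theta' = -\beta$, so that the light is received at right angles in $S$.

```python
import math


def doppler(nu_p, theta_p, beta):
    """(3-27): frequency in S for light emitted at angle theta' in S'."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return nu_p * g * (1.0 + beta * math.cos(theta_p))


def aberration(theta_p, beta):
    """(3-30): direction of propagation theta in S."""
    cos_t = (math.cos(theta_p) + beta) / (1.0 + beta * math.cos(theta_p))
    return math.acos(cos_t)


beta = 0.5                       # assumed relative speed v/c
theta_p = math.acos(-beta)       # emission angle in S' giving theta = pi/2 in S

theta = aberration(theta_p, beta)
nu_ratio = doppler(1.0, theta_p, beta)   # nu / nu' for unit source frequency

print(theta, math.pi / 2)                      # should agree
print(nu_ratio, math.sqrt(1.0 - beta**2))      # should agree, as in (3-31)
```

Setting $\theta' = 0$ or $\theta' = \pi$ in the same functions gives the longitudinal shifts for light emitted along or against the direction of the relative motion.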


3-6 Another point of view 


The Lorentz transformation holds for coordinate differentials, of 
course, so that from (3-2) we obtain 


$$dx' = \gamma(dx - \beta c\,dt), \quad dy' = dy, \quad dz' = dz, \quad dt' = \gamma\left(dt - \frac{\beta}{c}\,dx\right) \tag{3-32}$$


We also know from (2-3) that the Lorentz transformation corresponds 
to the equality 


$$(dx)^2 + (dy)^2 + (dz)^2 - c^2(dt)^2 = (dx')^2 + (dy')^2 + (dz')^2 - c^2(dt')^2 \tag{3-33}$$


Therefore, factoring out $(dt)^2$ and $(dt')^2$ and taking the square root, we find
that the quantity $d\tau$ given by

$$d\tau = dt\sqrt{1 - \frac{u^2}{c^2}} = dt'\sqrt{1 - \frac{u'^2}{c^2}} \tag{3-34}$$

where $u$ and $u'$ are the speeds of the particle in the two systems,
has the same value for all frames of reference connected by the Lorentz
transformation; this invariant $d\tau$ is called the proper time interval. We see
from (3-34) that $d\tau = dt_0$, the time interval measured in the system in
which the particle is at rest ($u = 0$).
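Suppose we divide both sides of the infinitesimal Lorentz transformation
(3-32) by $d\tau$; the result is

$$\frac{dx'}{d\tau} = \gamma\left(\frac{dx}{d\tau} - \beta c\frac{dt}{d\tau}\right), \quad \frac{dy'}{d\tau} = \frac{dy}{d\tau}, \quad \frac{dz'}{d\tau} = \frac{dz}{d\tau}, \quad \frac{dt'}{d\tau} = \gamma\left(\frac{dt}{d\tau} - \frac{\beta}{c}\frac{dx}{d\tau}\right) \tag{3-35}$$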


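Using (3-34) and comparing (3-35) with (3-2), we see that the four
quantities

$$\begin{aligned}
\frac{dx}{d\tau} &= \frac{dx}{dt\sqrt{1 - u^2/c^2}} = \frac{u_x}{\sqrt{1 - u^2/c^2}} \\
\frac{dy}{d\tau} &= \frac{u_y}{\sqrt{1 - u^2/c^2}} \\
\frac{dz}{d\tau} &= \frac{u_z}{\sqrt{1 - u^2/c^2}} \\
\frac{dt}{d\tau} &= \frac{1}{\sqrt{1 - u^2/c^2}}
\end{aligned} \tag{3-36}$$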


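transform in exactly the same way as do $x$, $y$, $z$, $t$. Therefore, since we know
that

$$x = \gamma(x' + \beta ct'), \qquad t = \gamma(t' + \beta x'/c)$$

we can immediately write the transformation equations for the analogous
quantities $dx/d\tau$ and $dt/d\tau$ as

$$\frac{u_x}{\sqrt{1 - u^2/c^2}} = \gamma\left[\frac{u_x'}{\sqrt{1 - u'^2/c^2}} + \frac{\beta c}{\sqrt{1 - u'^2/c^2}}\right] \tag{3-37}$$

$$\frac{1}{\sqrt{1 - u^2/c^2}} = \gamma\left[\frac{1}{\sqrt{1 - u'^2/c^2}} + \frac{\beta}{c}\frac{u_x'}{\sqrt{1 - u'^2/c^2}}\right] \tag{3-38}$$

and, on dividing (3-37) by (3-38), we obtain

$$u_x = \frac{u_x' + v}{1 + (vu_x'/c^2)}$$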
which is exactly the velocity transformation formula (3-14). The remaining 
two formulas (3-16) and (3-17) can be derived in exactly the same way. 
The point of view implied by our use of (3-37) and (3-38) is quite 
different from our previous method of derivation, which depended on 
differentiating the transformation equations and then eliminating terms. 


We were able to obtain (3-14) in this second way because we knew how
the quantities (3-36) were affected by a Lorentz transformation, that is, 
because we knew their transformation properties. 

Thus this example has shown us that we should be able to write the 
transformation equations for certain quantities quite easily, provided that 
we know something about their general properties. Accordingly, it 
appears that this whole question is worth looking into more deeply and is 
the motivation for what we are going to do next. 
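
The equivalence of the two routes to (3-14) can also be checked numerically. The Python sketch below, which assumes $c = 1$, motion along the $x'$ axis only, and hypothetical function names, transforms the pair $(dx/d\tau,\ dt/d\tau)$ exactly as one would transform $(x, t)$ and then takes the ratio, as in the division of (3-37) by (3-38).

```python
import math


def gamma(u, c=1.0):
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def ux_via_proper_time(ux_p, v, c=1.0):
    """Transform (dx/dtau, dt/dtau) like (x, t), then divide; cf. (3-37), (3-38)."""
    g_v = gamma(v, c)
    g_up = gamma(ux_p, c)                  # particle moves along x' only
    dx_dtau_p = g_up * ux_p                # primed form of (3-36)
    dt_dtau_p = g_up
    dx_dtau = g_v * (dx_dtau_p + v * dt_dtau_p)           # (3-37)
    dt_dtau = g_v * (dt_dtau_p + v * dx_dtau_p / c**2)    # (3-38)
    return dx_dtau / dt_dtau


def ux_direct(ux_p, v, c=1.0):
    """The velocity addition formula (3-14)."""
    return (ux_p + v) / (1.0 + v * ux_p / c**2)


print(ux_via_proper_time(0.7, 0.4), ux_direct(0.7, 0.4))  # should agree
```

No differentiation or elimination is needed in the first function; it relies only on knowing how the four quantities (3-36) transform.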


Exercises 


3-1. The average lifetime of a $\mu$-meson before radioactive decay as measured
in its "rest" system is $2.22 \times 10^{-6}$ second. What will be its average lifetime for
an observer with respect to whom the meson has a speed of $0.99c$? How far will
the meson travel in this time?

3-2. A rigid rod of length $L$ makes an angle $\theta$ with the $x$ axis of the system in
which it is at rest. Show that, for an observer moving with respect to the rod,
the apparent length $L'$ and angle $\theta'$ are given by

$$L' = L\left[(\cos^2\theta/\gamma^2) + \sin^2\theta\right]^{1/2}, \qquad \tan\theta' = \gamma\tan\theta$$


3-3. Show that two successive Lorentz transformations corresponding to
speeds $v_1$ and $v_2$ in the same direction are equivalent to a single Lorentz
transformation with a speed $v = (v_1 + v_2)/[1 + (v_1v_2/c^2)]$. What relation does this
result have to (3-14)?

3-4. Derive (3-16) and (3-17) by the methods used to obtain (3-37) and (3-38). 

3-5. Find the transformation laws for the components of the acceleration,
$a_x = du_x/dt$, etc. Show from these results that, if $\mathbf{a}_0$ is the acceleration in the
system in which the particle is momentarily at rest,

$$a_x = a_{x0}/\gamma^3, \qquad a_y = a_{y0}/\gamma^2, \qquad a_z = a_{z0}/\gamma^2$$


4 General Lorentz transformations 


Let us consider the two rectangular coordinate systems shown in Fig. 4-1. 
They have a common origin but otherwise differ by a rotation; that is, if 
we rotate one set of axes as a rigid body, it can be brought into coincidence 
with the other. The exact rotation could be specified, for example, by 
giving the direction cosines of the primed axes with respect to the un- 
primed axes. 

Let us consider a point which has the coordinates $x$, $y$, $z$ in one set of
axes, while in the other set it has the coordinates $x'$, $y'$, $z'$. Both of these
different sets of numbers locate the same point $P$.

Fig. 4-1


It is evident that the distance r of P from the origin is an invariant; 
that is, it has the same numerical value regardless of which coordinate 
system we are using. Thus we have 


$$r^2 = x^2 + y^2 + z^2 = x'^2 + y'^2 + z'^2 \tag{4-1}$$

If we introduce the notation

$$x_1 = x, \quad x_2 = y, \quad x_3 = z \tag{4-2}$$

then (4-1) can be written

$$r^2 = \sum_{i=1}^{3} x_i^2 = \sum_{i=1}^{3} x_i'^2 \tag{4-3}$$


The equations which relate the two sets of coordinates can be expected 
to be linear, so that we can write 


$$x_i' = \sum_{j=1}^{3} a_{ij}x_j \qquad (i = 1, 2, 3) \tag{4-4}$$

That is, the three equations

$$\begin{aligned}
x_1' &= a_{11}x_1 + a_{12}x_2 + a_{13}x_3 \\
x_2' &= a_{21}x_1 + a_{22}x_2 + a_{23}x_3 \\
x_3' &= a_{31}x_1 + a_{32}x_2 + a_{33}x_3
\end{aligned} \tag{4-5}$$

The set of nine numbers $a_{ij}$ characterizes the rotation connecting the
primed and unprimed axis systems, and, in fact, the $a_{ij}$ can be related to
the various direction cosines. It is also clear that these $a_{ij}$ cannot all be
independent because the transformation equations (4-4) have not yet been
required to satisfy the condition of keeping $r^2$ invariant as given in (4-3);
that is, they must yet satisfy a fundamental requirement that (4-4) describe 
a rotation. We could go on to elaborate this treatment of the three- 
dimensional case which we have begun, but it is just as easily done for a 
more general situation to which we now turn. 


4-1 Lorentz transformations as rotations 


We recall from (2-3) that the basic condition for the Lorentz trans- 
formation was the invariance of the quantity s? given by 


$$s^2 = x^2 + y^2 + z^2 - c^2t^2 = x'^2 + y'^2 + z'^2 - c^2t'^2 \tag{4-6}$$

If we introduce the notation

$$x_1 = x, \quad x_2 = y, \quad x_3 = z, \quad x_4 = ict \tag{4-7}$$

then (4-6) can be written

$$s^2 = \sum_{\mu=1}^{4} x_\mu^2 = \sum_{\mu=1}^{4} x_\mu'^2 \tag{4-8}$$

On comparing (4-8) and (4-3), we see that the most general Lorentz
transformation can be interpreted as a rigid rotation of axes in a four-
dimensional space whose axes are $x_1$, $x_2$, $x_3$, $x_4 = ict$.

We can write the transformation equations relating the two sets in the 
general linear form 


$$x_\mu' = \sum_\nu a_{\mu\nu}x_\nu \qquad (\mu = 1, 2, 3, 4) \tag{4-9}$$


The transformation must be linear for, if it were not, it would give a 
preferred status to whatever origin we happened to choose, and this would 
violate the first postulate by providing an objective way of distinguishing 
one coordinate system from the other. 

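In order that (4-9) represent a Lorentz transformation (four-dimensional
rotation), (4-8) must be satisfied; we would like to learn what requirements
are thus imposed on the sixteen coefficients $a_{\mu\nu}$. Substituting (4-9) into
(4-8), we obtain

$$\sum_\lambda x_\lambda'^2 = \sum_\lambda\Big(\sum_\mu a_{\lambda\mu}x_\mu\Big)\Big(\sum_\nu a_{\lambda\nu}x_\nu\Big) = \sum_{\mu\nu}\Big(\sum_\lambda a_{\lambda\mu}a_{\lambda\nu}\Big)x_\mu x_\nu \tag{4-10}$$

Since (4-10) must also equal $\sum_\mu x_\mu^2$, we see that we must have

$$\sum_\lambda a_{\lambda\mu}a_{\lambda\nu} = 0 \quad \text{if } \mu \neq \nu$$

$$\sum_\lambda a_{\lambda\mu}^2 = 1 \quad \text{if } \mu = \nu$$

or, expressed more simply,

$$\sum_\lambda a_{\lambda\mu}a_{\lambda\nu} = \delta_{\mu\nu} \tag{4-11}$$

which is our condition for the rigid rotation.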
We can also write 


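$$x_\nu = \sum_\lambda b_{\nu\lambda}x_\lambda' \tag{4-12}$$

where the $b_{\nu\lambda}$ are a set of coefficients which are evidently related to the $a_{\mu\nu}$
of (4-9). To determine this relation, we substitute (4-12) into (4-9) and
find that

$$x_\mu' = \sum_\nu a_{\mu\nu}\Big(\sum_\lambda b_{\nu\lambda}x_\lambda'\Big) = \sum_\lambda\Big(\sum_\nu a_{\mu\nu}b_{\nu\lambda}\Big)x_\lambda' = \sum_\lambda x_\lambda'\delta_{\mu\lambda} \tag{4-13}$$

after we have expressed $x_\mu'$ in terms of a sum in the last step. Comparing
the last two terms in (4-13), we see that

$$\sum_\nu a_{\mu\nu}b_{\nu\lambda} = \delta_{\mu\lambda} \tag{4-14}$$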


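The last result can be solved for the $b$'s by multiplying both sides of (4-14)
by $a_{\mu\rho}$, summing over $\mu$, and using (4-11):

$$\sum_{\mu\nu} a_{\mu\rho}a_{\mu\nu}b_{\nu\lambda} = \sum_\mu a_{\mu\rho}\delta_{\mu\lambda} = a_{\lambda\rho} = \sum_\nu b_{\nu\lambda}\sum_\mu a_{\mu\rho}a_{\mu\nu} = \sum_\nu b_{\nu\lambda}\delta_{\rho\nu} = b_{\rho\lambda} \tag{4-15}$$

Thus $b_{\rho\lambda} = a_{\lambda\rho}$, and we can therefore say that

$$\text{if } x_\mu' = \sum_\nu a_{\mu\nu}x_\nu, \ \text{ then } \ x_\nu = \sum_\mu a_{\mu\nu}x_\mu' \tag{4-16}$$

and also that

$$\sum_\lambda a_{\mu\lambda}a_{\nu\lambda} = \delta_{\mu\nu} \tag{4-17}$$

which can be found by substituting (4-15) into (4-14) and interchanging
the indices $\lambda$ and $\nu$.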

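When the notation of (4-7) is used, the transformation equations for the
particular Lorentz transformation (3-2) which we have been using up to
now become, respectively,

$$x_1' = \gamma x_1 + i\beta\gamma x_4, \quad x_2' = x_2, \quad x_3' = x_3, \quad x_4' = -i\beta\gamma x_1 + \gamma x_4 \tag{4-18}$$

By comparing (4-18) and (4-9) we see that the coefficients $a_{\mu\nu}$ for this
special transformation can be written as the matrix

$$(a_{\mu\nu}) = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix} = \begin{pmatrix} \gamma & 0 & 0 & i\beta\gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -i\beta\gamma & 0 & 0 & \gamma \end{pmatrix} \tag{4-19}$$
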
It will be left as exercises to verify that (4-19) satisfies (4-11) and (4-17).

It should be mentioned that a rigid rotation of the axes in three-dimensional
space will also keep the expression $x^2 + y^2 + z^2 - c^2t^2$ invariant
because of (4-1) and its independence of the time. Thus such a physical
rotation of coordinate axes should be included in the group of general
Lorentz transformations described by (4-9).

When the condition (4-8) is written in differential form, it becomes

$$(ds)^2 = \sum_\mu (dx_\mu)^2 = \sum_\mu (dx_\mu')^2 \tag{4-20}$$

In this case the quantity $(ds)^2$ is called an interval. The interval is always
real, but in contrast to the distance between points in three-dimensional
space it need not be positive and, in fact, we can have

$$(ds)^2 \gtreqless 0 \tag{4-21}$$


If $(ds)^2 > 0$, the interval is called spacelike; if $(ds)^2 < 0$, it is called
timelike. 
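
The picture of a Lorentz transformation as a rotation with an imaginary fourth coordinate can be explored with Python's built-in complex numbers. The sketch below is an illustration under stated assumptions: $\beta = 0.6$, a sample event with $ct = 5$, and hypothetical helper names. It builds the matrix of (4-19), forms the sums appearing in (4-11) and (4-17) (ordinary products, with no complex conjugation), and checks that $\sum_\mu x_\mu^2$ is unchanged for the sample event.

```python
import math


def lorentz_matrix(beta):
    """The coefficients of (4-19), for coordinates (x, y, z, ict)."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [g, 0, 0, 1j * beta * g],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [-1j * beta * g, 0, 0, g],
    ]


def column_sums(a):
    """Sum over lambda of a[lambda][mu] * a[lambda][nu], as in (4-11)."""
    return [[sum(a[l][m] * a[l][n] for l in range(4)) for n in range(4)]
            for m in range(4)]


def row_sums(a):
    """Sum over lambda of a[mu][lambda] * a[nu][lambda], as in (4-17)."""
    return [[sum(a[m][l] * a[n][l] for l in range(4)) for n in range(4)]
            for m in range(4)]


def sum_of_squares(x):
    """Sum of x_mu squared, as in (4-8); no complex conjugate is taken."""
    return sum(xi * xi for xi in x)


a = lorentz_matrix(0.6)
event = [3.0, 1.0, -2.0, 5.0j]     # (x, y, z, ict) with ct = 5
event_p = [sum(a[m][n] * event[n] for n in range(4)) for m in range(4)]

print(column_sums(a))              # should be the unit matrix
print(sum_of_squares(event), sum_of_squares(event_p))
```

For this event the sum of squares is negative, so by the definition above its interval from the origin is timelike.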


4-2 4-vectors and tensors 


DEFINITION. An invariant is a quantity whose value does not change in 
a Lorentz transformation. Examples are $s^2$ of (4-6) and the proper
time $d\tau$ of (3-34).


DEFINITION. A 4-vector $A_\mu$ is a set of four quantities $(A_1, A_2, A_3, A_4)$
whose transformation properties are the same as those of the coordinates.
That is, if the Lorentz transformation is described by (4-9), the components
$A_\mu$ and $A_\mu'$ are related by

$$A_\mu' = \sum_\nu a_{\mu\nu}A_\nu \qquad (\mu = 1, 2, 3, 4) \tag{4-22}$$

and the coefficients $a_{\mu\nu}$ are exactly the same as those in (4-9).


It follows from this definition that the first three components $(A_1, A_2, A_3)$
must be the components of an ordinary three-dimensional vector $\mathbf{A}$.
The sum of two 4-vectors must be a 4-vector so that

$$C_\mu = (A + B)_\mu = A_\mu + B_\mu \tag{4-23}$$

We can define a scalar product of two 4-vectors by generalizing the
result obtained for the dot product $\mathbf{A} \cdot \mathbf{B}$ as given by (I: 1-11); thus

$$\text{Scalar product} = \sum_\mu A_\mu B_\mu \tag{4-24}$$




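THEOREM. The scalar product is an invariant. In order to prove this,
we use (4-22) and (4-11):

$$\sum_\mu A_\mu'B_\mu' = \sum_\mu\Big(\sum_\nu a_{\mu\nu}A_\nu\Big)\Big(\sum_\lambda a_{\mu\lambda}B_\lambda\Big) = \sum_{\nu\lambda} A_\nu B_\lambda\sum_\mu a_{\mu\nu}a_{\mu\lambda} = \sum_{\nu\lambda} A_\nu B_\lambda\delta_{\nu\lambda} = \sum_\nu A_\nu B_\nu \qquad \text{q.e.d.} \tag{4-25}$$

The square of the "length" of a 4-vector is defined as the scalar product
of the 4-vector with itself; thus

$$(\text{“length”})^2 = \sum_\mu A_\mu^2 \tag{4-26}$$
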
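Besides the coordinates, we have already met an example of a 4-vector;
it is the 4-velocity $U_\mu$, whose components are obtained from (3-36) and
(4-7) as

$$U_1 = \frac{dx_1}{d\tau} = \frac{u_x}{\sqrt{1 - u^2/c^2}} \tag{4-27a}$$

$$U_2 = \frac{dx_2}{d\tau} = \frac{u_y}{\sqrt{1 - u^2/c^2}} \tag{4-27b}$$

$$U_3 = \frac{dx_3}{d\tau} = \frac{u_z}{\sqrt{1 - u^2/c^2}} \tag{4-27c}$$

$$U_4 = \frac{dx_4}{d\tau} = \frac{ic}{\sqrt{1 - u^2/c^2}} \tag{4-27d}$$

where

$$u^2 = u_x^2 + u_y^2 + u_z^2 \tag{4-28}$$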


We know that these four quantities constitute a 4-vector because we 
showed in (3-37) and (3-38) that their transformation properties are the 
same as those of the coordinates. 

We find from (4-26), (4-27), and (4-28) that the length of the 4-velocity
is given by

$$(\text{length})^2 = \sum_\mu U_\mu^2 = \frac{u_x^2 + u_y^2 + u_z^2 - c^2}{1 - (u^2/c^2)} = -c^2 \tag{4-29}$$

and is negative. 
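
The result (4-29) holds for any velocity with $u < c$, which is easy to confirm numerically. The Python sketch below assumes units with $c = 1$ and an arbitrary sample velocity; the function names are hypothetical. The fourth component is imaginary because $x_4 = ict$, and the sum of squares is formed without complex conjugation.

```python
import math


def four_velocity(ux, uy, uz, c=1.0):
    """Components (4-27a) to (4-27d) of the 4-velocity."""
    u2 = ux**2 + uy**2 + uz**2              # (4-28)
    g = 1.0 / math.sqrt(1.0 - u2 / c**2)
    return [g * ux, g * uy, g * uz, 1j * c * g]


def length_squared(A):
    """(4-26): the sum of the squares of the four components."""
    return sum(a * a for a in A)


U = four_velocity(0.3, -0.4, 0.5)           # assumed velocity, in units of c
print(length_squared(U))                    # should be close to -1, i.e. -c^2
```

The answer does not depend on the chosen velocity components, which is what one expects of an invariant.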


DEFINITION. A second rank tensor $T_{\mu\nu}$ is a set of sixteen quantities
such that each index transforms in the same way as do the coordinate
indices; thus, if (4-9) holds, then

$$T_{\mu\nu}' = \sum_{\lambda\rho} a_{\mu\lambda}a_{\nu\rho}T_{\lambda\rho} \tag{4-30}$$

Two indices are needed to specify each component; hence the term
“second rank.” Similarly, a 4-vector with its single index is a tensor of the 
first rank, while an invariant scalar is a tensor of zero rank. One can 
define tensors of higher rank by a straightforward generalization of (4-30), 
but we shall not need them. 

A tensor is symmetric if $T_{\mu\nu} = T_{\nu\mu}$; hence a symmetric tensor has only
ten independent components. A tensor is antisymmetric if $T_{\mu\nu} = -T_{\nu\mu}$
and therefore has only six independent components since the diagonal
elements must be zero; that is, $T_{\mu\mu} = 0$. A tensor does not have to be
symmetric or antisymmetric, but we now want to show that, if it is one or
the other, this symmetry property is not changed by an arbitrary Lorentz
transformation; that is, if $T_{\mu\nu}$ is symmetric (or antisymmetric), $T_{\mu\nu}'$ is also
symmetric (or antisymmetric). In order to prove this let us write

$$T_{\mu\nu} = \pm T_{\nu\mu} \tag{4-31}$$

and choose the upper sign for the symmetric case and the lower for the
antisymmetric case. We find from (4-30) and (4-31) that

$$T_{\mu\nu}' = \sum_{\lambda\rho} a_{\mu\lambda}a_{\nu\rho}T_{\lambda\rho} = \pm\sum_{\lambda\rho} a_{\mu\lambda}a_{\nu\rho}T_{\rho\lambda} \tag{4-32}$$

If we now interchange $\mu$ and $\nu$ in (4-30), and then interchange $\rho$ and $\lambda$, we
obtain

$$T_{\nu\mu}' = \sum_{\lambda\rho} a_{\nu\lambda}a_{\mu\rho}T_{\lambda\rho} = \sum_{\lambda\rho} a_{\nu\rho}a_{\mu\lambda}T_{\rho\lambda} \tag{4-33}$$

On comparing (4-32) and (4-33) we see that $T_{\mu\nu}' = \pm T_{\nu\mu}'$ with the choice
of signs the same as that used in the original system as given by (4-31), and
thus we have shown that the symmetry property is unchanged.

An important example is 


$$F_{\mu\nu} = A_\mu B_\nu - A_\nu B_\mu \tag{4-34}$$

Although we have written this set of sixteen numbers as $F_{\mu\nu}$, we have not
yet shown that $F_{\mu\nu}$ is really a tensor, although, if it is, it will be antisymmetric.
We now want to show that $F_{\mu\nu}$ defined in (4-34) actually has the
transformation properties of a second rank tensor. Substituting from
(4-22) into (4-34), we find that

$$\begin{aligned}
F_{\mu\nu}' &= A_\mu'B_\nu' - A_\nu'B_\mu' = \sum_{\lambda\rho} a_{\mu\lambda}a_{\nu\rho}A_\lambda B_\rho - \sum_{\lambda\rho} a_{\nu\rho}a_{\mu\lambda}A_\rho B_\lambda \\
&= \sum_{\lambda\rho} a_{\mu\lambda}a_{\nu\rho}(A_\lambda B_\rho - A_\rho B_\lambda) = \sum_{\lambda\rho} a_{\mu\lambda}a_{\nu\rho}F_{\lambda\rho}
\end{aligned} \tag{4-35}$$

which, on comparison with (4-30), is seen to complete the proof. We also
note that, if $\mu, \nu = 1, 2, 3$, the components of $F_{\mu\nu}$ are precisely those of the
ordinary three-dimensional cross product $\mathbf{A} \times \mathbf{B}$, which is accordingly
seen as actually being a tensor rather than a vector. 
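
The tensor character of $A_\mu B_\nu - A_\nu B_\mu$ can be checked numerically by building it two ways: from the transformed 4-vectors, and by applying the second rank rule (4-30) to the untransformed array. The Python sketch below assumes the special transformation (4-19) with $\beta = 0.6$ and two arbitrary sample 4-vectors whose fourth components are imaginary; all names and numbers are illustrative.

```python
import math


def lorentz_matrix(beta):
    """The coefficients of (4-19), for coordinates (x, y, z, ict)."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [g, 0, 0, 1j * beta * g],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [-1j * beta * g, 0, 0, g],
    ]


def transform_vector(a, A):
    """(4-22): A'_mu = sum over nu of a_mu_nu A_nu."""
    return [sum(a[m][n] * A[n] for n in range(4)) for m in range(4)]


def transform_tensor(a, T):
    """(4-30): T'_mu_nu = sum over lambda, rho of a_mu_lambda a_nu_rho T_lambda_rho."""
    return [[sum(a[m][l] * a[n][r] * T[l][r]
                 for l in range(4) for r in range(4))
             for n in range(4)] for m in range(4)]


def antisymmetric_product(A, B):
    """(4-34): F_mu_nu = A_mu B_nu - A_nu B_mu."""
    return [[A[m] * B[n] - A[n] * B[m] for n in range(4)] for m in range(4)]


a = lorentz_matrix(0.6)
A = [1.0, 2.0, 3.0, 4.0j]       # sample 4-vectors (assumed values)
B = [-2.0, 0.5, 1.0, 1.5j]

F_from_vectors = antisymmetric_product(transform_vector(a, A),
                                       transform_vector(a, B))
F_transformed = transform_tensor(a, antisymmetric_product(A, B))

max_diff = max(abs(F_from_vectors[m][n] - F_transformed[m][n])
               for m in range(4) for n in range(4))
print(max_diff)                 # should be at rounding-error level
```

Agreement of the two arrays is what (4-35) asserts; the antisymmetry of the transformed array illustrates the preceding result on symmetry properties.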
By using the same methods, it can be shown that the following quantities
are 4-vectors:

$$B_\mu = \sum_\nu A_\nu T_{\nu\mu} \quad \text{and} \quad C_\mu = \sum_\nu T_{\mu\nu}A_\nu \tag{4-36}$$


4-3 Differential operators 


According to the rule for differentiating a function of given variables 
(unprimed) with respect to different variables (primed), we have 


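$$\frac{\partial}{\partial x_\mu'} = \sum_\nu \frac{\partial x_\nu}{\partial x_\mu'}\frac{\partial}{\partial x_\nu} \tag{4-37}$$

From (4-16), we see that $\partial x_\nu/\partial x_\mu' = a_{\mu\nu}$ so that (4-37) becomes

$$\frac{\partial}{\partial x_\mu'} = \sum_\nu a_{\mu\nu}\frac{\partial}{\partial x_\nu} \tag{4-38}$$

On comparing (4-38) and (4-22), we see that the four operators $\partial/\partial x_\mu$
($\mu = 1, 2, 3, 4$) transform exactly like a 4-vector. In fact, we can define a
four-dimensional del operator $\Box$ with these components:

$$\Box = \left(\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \frac{\partial}{\partial x_3}, \frac{\partial}{\partial x_4}\right) \tag{4-39}$$

and use it in a fashion analogous to the three-dimensional operator $\nabla$.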


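If $\varphi$ is an invariant, we can define a gradient $\Box\varphi$ as the 4-vector whose
components are $\partial\varphi/\partial x_\mu$. Similarly, we can define

Divergence:

$$\sum_\mu \frac{\partial A_\mu}{\partial x_\mu} \quad \text{(an invariant)} \tag{4-40}$$

Four-dimensional Laplacian:

$$\Box^2 = \sum_\mu \frac{\partial^2}{\partial x_\mu^2} = \nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} \quad \text{(an invariant)} \tag{4-41}$$

Curl:

$$\mathcal{C}_{\mu\nu} = \frac{\partial A_\nu}{\partial x_\mu} - \frac{\partial A_\mu}{\partial x_\nu} \quad \text{(an antisymmetric tensor)} \tag{4-42}$$

Divergence of a tensor:

$$\sum_\nu \frac{\partial T_{\mu\nu}}{\partial x_\nu} = D_\mu \quad \text{(a 4-vector)} \tag{4-43}$$

We note that, if $\mu, \nu \neq 4$ in (4-42), the corresponding components of
$\mathcal{C}_{\mu\nu}$ are the components of the three-dimensional vector, curl $\mathbf{A}$.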


Although this discussion can be carried much further, the amount of 
tensor analysis we have developed here will be sufficient for our purposes. 
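
The second equality in the four-dimensional Laplacian (4-41) follows in one step from the notation (4-7). The short derivation below is a worked sketch of that step, using only $x_4 = ict$ and the chain rule; it introduces no symbols beyond those already defined.

```latex
% Since x_4 = ict, a derivative with respect to x_4 is a time derivative
% divided by ic:
\frac{\partial}{\partial x_4} = \frac{1}{ic}\,\frac{\partial}{\partial t}
\quad\Longrightarrow\quad
\frac{\partial^2}{\partial x_4^2} = \frac{1}{(ic)^2}\,\frac{\partial^2}{\partial t^2}
 = -\frac{1}{c^2}\,\frac{\partial^2}{\partial t^2}
% The first three terms of the sum form the ordinary Laplacian, so that
\sum_{\mu=1}^{4}\frac{\partial^2}{\partial x_\mu^2}
 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}
 + \frac{\partial^2}{\partial z^2} - \frac{1}{c^2}\,\frac{\partial^2}{\partial t^2}
 = \nabla^2 - \frac{1}{c^2}\,\frac{\partial^2}{\partial t^2}
```

This is the operator of the wave equation, so the invariance of (4-41) restates in four-dimensional language the property that Exercise 2-1 asks to be shown by direct substitution.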


4-4 The use of 4-vectors and tensors in relativity 


It is wise to pause at this point and remind ourselves of the purpose of 
what we have just been doing. If we look back at the first postulate, we 
see that it says, in effect, that there should be no way of making an absolute 
distinction between two systems moving with constant velocity with 
respect to each other. With the help of the second postulate, we found 
that observations made in the two systems must be correlated by means of 
the Lorentz transformation. Combining these two results, we shall see 
immediately that we can say that the two postulates taken together require 
that the laws of physics when properly formulated must have their form 
unchanged when subjected to a Lorentz transformation; that is, they 
must be covariant with respect to the Lorentz transformation. In order to 
see that this is correct, let us investigate what the consequences would be 
if the last statement were not correct in a very simple case. Suppose that a
particular law we were considering had the general form $\mathcal{E} = \mathcal{F} + \mathcal{G}$.
Let us suppose also that when we referred everything to the primed
system by means of a Lorentz transformation, this law became
$\mathcal{E}' = \mathcal{F}' + \mathcal{G}' + \epsilon'$, where $\epsilon'$ depended on the particular primed system
involved. This equation is clearly not covariant, and, in fact, the very
existence of the $\epsilon'$ term would enable us to distinguish among the various
systems in an absolute way. As this would definitely violate the first 
postulate, we can see the reason for requiring covariance with respect to a 
Lorentz transformation. 

We have seen in the last sections that, by their very definitions, 4-vectors 
and second rank tensors are covariant. Thus it is evident that, if we were 
to express all our physical laws in 4-vector or tensor form, they would be 
automatically covariant with respect to the Lorentz transformation and 
would satisfy both postulates of special relativity. With this in mind we 
can see, in essence, what our next considerations will be. We shall look 
at some physical laws and first of all determine if they can be written in 
terms of 4-vectors or tensors. If they already are or easily can be, we need 
do nothing more because we know they are already valid in special 
relativity. If the laws are not covariant, yet are correct in the
non-relativistic case, then our task is to try to generalize these results so that they can
be expressed in terms of 4-vectors or tensors and thus be compatible with
special relativity. There are two points to remember in connection with
the latter situation, however; first, our generalizations will always have to 
reduce to the known valid results in the non-relativistic limit; second, our 
generalizations will still need to be tested by experiment because the 
process of generalization to 4-vector and tensor form is not necessarily a 
unique process. In the next chapter, we shall begin this program by 
considering the cornerstone of physics. 


Exercises 


4-1. Discuss briefly what the consequences would be if the transformation 
equations (4-4) were not linear. 

4-2. Show that the coefficients $a_{\mu\nu}$ given in (4-19) satisfy the conditions (4-11)
and (4-17). 

4-3. Show that, if the interval between two events is spacelike, there exists a 
frame of reference in which the two events occur simultaneously; similarly, if 
the interval is timelike, show that there is a reference frame in which they occur 
at the same point. 

4-4. Show that, if $\sum_\mu A_\mu B_\mu$ is an invariant for any arbitrary 4-vector $A_\mu$,
then $B_\mu$ is also a 4-vector.

4-5. Show that the quantities defined in (4-36) are 4-vectors. 

4-6. Use the transformation properties of the 4-acceleration defined by
$a_\mu = dU_\mu/d\tau = d^2x_\mu/d\tau^2$ to derive the transformation formulas for the ordinary
acceleration a which were previously obtained in Exercise 3-5.

4-7. Derive the set of coefficients $a_{\mu\nu}$ which describes a general Lorentz
transformation consisting of a 30° rotation about the y axis plus a translation 
along the rotated x’ axis with a constant speed v = 3c. 


5 Particle mechanics


The basic equation of the non-relativistic mechanics of a mass point
subject to the force f is

$$\mathbf{f} = \frac{d\mathbf{p}}{dt} \tag{5-1}$$

where

$$\mathbf{p} = m_0\mathbf{u} \tag{5-2}$$

is the linear momentum of the particle in terms of its velocity u and its
inertial (or rest) mass $m_0$. Equation (5-1) is clearly not in 4-vector or
tensor form because it has only three components, the velocity u is not a
4-vector, and the differentiation is with respect to the time which, we know,
is not an invariant scalar. Therefore we have to generalize these equations
to make them satisfy the requirements of special relativity.


5-1 4-momentum and 4-force 


We begin with the momentum. Although u is not a 4-vector, it is
closely related to the 4-vector $U_\mu = dx_\mu/d\tau$. Hence a plausible
generalization of (5-2) is to use the invariant scalar $m_0$ and define the
4-momentum $P_\mu$ as

$$P_\mu = m_0 U_\mu \tag{5-3}$$

We can write (5-3) in component form by substituting from (4-27);
we then obtain

$$P_j = \frac{m_0 u_j}{\sqrt{1 - u^2/c^2}} \quad (j = 1, 2, 3), \qquad P_4 = \frac{i m_0 c}{\sqrt{1 - u^2/c^2}} \tag{5-4}$$

We see that, as $u/c \to 0$, $P_j \to m_0 u_j$, which is the non-relativistic momentum;
thus (5-3) appears to be a reasonable choice. For the moment, we ignore
the extra component $P_4$ which we have introduced by this process.
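
The low-speed limit quoted above is easy to check numerically. The following is a minimal sketch in Python and is not part of the original text; the function name `four_momentum` and the SI value used for c are choices made here for illustration. The fourth component is returned as a complex number so as to follow the $x_4 = ict$ convention of this book.

```python
import math

C = 299_792_458.0  # speed of light in m/s (SI value, assumed for this sketch)


def four_momentum(m0, u):
    """Components (P1, P2, P3, P4) of (5-4) for rest mass m0 and velocity u.

    u is a tuple (ux, uy, uz) in m/s. P4 is imaginary in the x4 = ict
    convention, so it is returned as a Python complex number.
    """
    speed_squared = sum(component ** 2 for component in u)
    root = math.sqrt(1.0 - speed_squared / C ** 2)
    spatial = tuple(m0 * component / root for component in u)
    return spatial + (1j * m0 * C / root,)


if __name__ == "__main__":
    m0 = 1.0  # kg, hypothetical test particle
    for fraction in (0.001, 0.1, 0.9):
        u = (fraction * C, 0.0, 0.0)
        P = four_momentum(m0, u)
        # Ratio of the relativistic P1 to the Newtonian momentum m0 * u
        print(fraction, P[0] / (m0 * u[0]))
```

The printed ratio is the factor $1/\sqrt{1 - u^2/c^2}$, which stays close to unity until u becomes an appreciable fraction of c.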

In order to get an equation of motion which is analogous to (5-1), we
differentiate (5-3) with respect to the invariant $d\tau$ and define the 4-force
$F_\mu$ by

$$F_\mu = \frac{dP_\mu}{d\tau} = \frac{d}{d\tau}(m_0 U_\mu) = m_0\frac{d^2x_\mu}{d\tau^2} \tag{5-5}$$

This result is our desired generalization of the Newtonian equation of
motion; the 4-vector $F_\mu$ is also called the Minkowski force.

If we desire to relate (5-5) to the ordinary force components, we can
substitute for $d\tau$ from (3-34) so that (5-5) becomes

$$F_\mu\sqrt{1 - u^2/c^2} = \frac{dP_\mu}{dt} = \frac{d}{dt}(m_0 U_\mu) \tag{5-6}$$

and, if we use (4-27) to write out the first three components of (5-6), we
find that

$$F_j\sqrt{1 - u^2/c^2} = \frac{d}{dt}\left(\frac{m_0 u_j}{\sqrt{1 - u^2/c^2}}\right) = f_j \tag{5-7}$$

where the $f_j$ must be the x, y, z components of the ordinary force f since
(5-7) must reduce to (5-1) and (5-2) combined when $u/c \ll 1$. Thus we
see that the first three components of the Minkowski force can be simply
related to the ordinary force by means of the equations

$$F_j = \frac{f_j}{\sqrt{1 - u^2/c^2}} \quad (j = 1, 2, 3) \tag{5-8}$$


In practical calculations, one often does not want to deal with the
4-force but prefers to use simply the three equations (5-7); thus it is
common practice to write

$$\mathbf{f} = \frac{d}{dt}\left(\frac{m_0\mathbf{u}}{\sqrt{1 - u^2/c^2}}\right) \tag{5-9}$$

and to call this the relativistic equation of motion. Then, if one wants to
continue to regard force as the rate of change of momentum, (5-9) can be
written f = dp/dt, where

$$\mathbf{p} = m\mathbf{u} \tag{5-10}$$

and

$$m = \frac{m_0}{\sqrt{1 - u^2/c^2}} \tag{5-11}$$


The quantity m introduced in this way is then called the mass of the 
particle because of the analogy between (5-10) and the non-relativistic 
equation (5-2). If this procedure is followed, (5-11) is the basis for the 
statement that the mass of a particle is no longer a constant but increases 
as the speed increases. 

On the other hand, it is not at all necessary to interpret our results in this 
way, and doing so follows only from a natural desire to write momentum 
always as the product of the mass and the ordinary velocity u. In fact, 
such an approach actually contradicts the basic philosophy of the covariant 
approach of relativity because (5-10) is not in 4-vector form. It is much 
more in keeping with relativistic concepts to ascribe an invariant scalar
property, the rest mass $m_0$, to the particle and then define the 4-vector
momentum as the product of this scalar invariant with the 4-vector
velocity, exactly as we did in (5-3).


5-2 Energy and the 4-momentum


It is now appropriate to take account of the fact that in our process of 
generalization we began with (5-2) which has only three components and 
ended up with (5-3) which has four. We have seen that the first three
components of $P_\mu$ can be adequately interpreted, and now we want to
look at the “extra” one. We shall find that it is really not so unfamiliar
after all.

If we introduce a new quantity W by

$$P_4 = \frac{iW}{c} \tag{5-12}$$

we find from (5-4) that

$$W = \frac{m_0c^2}{\sqrt{1 - u^2/c^2}} \tag{5-13}$$

In order to interpret (5-13), let us look at its approximate form when
$u/c \ll 1$; expanding the denominator, we find that

$$W = m_0c^2\left(1 + \frac{u^2}{2c^2} + \cdots\right) = m_0c^2 + \tfrac{1}{2}m_0u^2 + \cdots \tag{5-14}$$


The second term can be recognized immediately, for it is simply the 
ordinary expression for the kinetic energy of the particle in Newtonian 
mechanics. Accordingly, it seems quite reasonable in this more general 
case to call W the total energy of the particle. We see that, if the particle
is at rest so that u = 0, the value of W is $m_0c^2$, which is called the rest
energy. It is therefore customary to regard the total energy of a particle
as being composed of two parts: an intrinsic part due to its rest mass
(the rest energy) and the additional part which appears when the particle
is moving (the kinetic energy). Thus, if we let T be the kinetic energy, we
can write

$$W = m_0c^2 + T \tag{5-15}$$

so that

$$T = m_0c^2\left[\frac{1}{\sqrt{1 - u^2/c^2}} - 1\right] \tag{5-16}$$

according to (5-13).
We see now that the fourth component of the 4-momentum is not 
completely mysterious but is directly proportional to the energy of the 
particle. We also see that the linear momentum and energy of a particle 
are not to be regarded as different entities, but simply as two aspects of the 
same attributes of the particle since they appear as separate components of 
the same 4-vector. 
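
As a rough numerical illustration of (5-13) through (5-16), the sketch below compares the relativistic kinetic energy with the Newtonian value $\tfrac{1}{2}m_0u^2$ at several speeds. It is written in Python and is not part of the original text; the function names and the sample rest mass are assumptions chosen only for the example.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def total_energy(m0, u):
    """Total energy W of (5-13) for rest mass m0 (kg) and speed u (m/s)."""
    return m0 * C ** 2 / math.sqrt(1.0 - (u / C) ** 2)


def kinetic_energy(m0, u):
    """Relativistic kinetic energy T = W - m0 c^2 of (5-15) and (5-16)."""
    return total_energy(m0, u) - m0 * C ** 2


def newtonian_kinetic_energy(m0, u):
    """The familiar (1/2) m0 u^2, the second term of the expansion (5-14)."""
    return 0.5 * m0 * u ** 2


if __name__ == "__main__":
    m0 = 9.109e-31  # kg, roughly the electron rest mass
    for fraction in (0.01, 0.1, 0.5, 0.9):
        u = fraction * C
        ratio = kinetic_energy(m0, u) / newtonian_kinetic_energy(m0, u)
        print(f"u/c = {fraction}: T / (m0 u^2 / 2) = {ratio:.4f}")
```

The ratio approaches unity as u/c becomes small, which is the content of the expansion (5-14), and grows without limit as u approaches c.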
We can obtain the same result quantitatively in the following manner.
The components of the 4-momentum are

$$P_1 = p_x, \quad P_2 = p_y, \quad P_3 = p_z, \quad P_4 = \frac{iW}{c} \tag{5-17}$$

and transform according to

$$P_\mu' = \sum_\nu a_{\mu\nu}P_\nu \tag{5-18}$$

because of (4-22). Substituting (5-17) into (5-18) and using as an example
the particular Lorentz transformation described by (4-19), we find that
the equations appropriate to this case are

$$p_x' = \gamma\left(p_x - \frac{\beta W}{c}\right), \quad p_y' = p_y, \quad p_z' = p_z, \quad W' = \gamma(W - \beta c\,p_x) \tag{5-19}$$

Clearly, what appears as energy in one system appears as momentum in
another, and, conversely, what appears to be momentum in one system is
energy in another. As a simple illustration, let us consider the specific case
in which the particle is at rest in S'. Then u' = 0, so that $p_x' = p_y' = p_z' = 0$
and $W' = m_0c^2$, according to (5-4) and (5-13). We then find
from (5-19) (or its equivalent obtained by interchanging primed and
unprimed symbols and changing the sign of $\beta$) that

$$p_x = \frac{\gamma\beta W'}{c}, \qquad p_y = p_z = 0, \qquad W = \gamma W'$$

and, therefore,

$$p_x = \frac{m_0v}{\sqrt{1 - v^2/c^2}}, \qquad W = \frac{m_0c^2}{\sqrt{1 - v^2/c^2}}$$
These results, however, are exactly those we would expect to get, according 
to (5-4) and (5-13), since, from the point of view of the observer in the 
unprimed system S, the particle is moving along the x axis with speed v. 
In summary, then, the particle possesses only energy in S’ but has both 
energy and momentum with respect to S. 
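
The transformation (5-19) and the rest-particle illustration can be reproduced with a few lines of code. The sketch below is in Python and is not from the original text; the function name `boost_momentum_energy` is hypothetical, and the boost is assumed to be along the x axis as in the particular transformation used above.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def boost_momentum_energy(p, W, v):
    """Apply (5-19) to momentum p = (px, py, pz) and energy W.

    Returns the momentum and energy seen in S', which is assumed to move
    with speed v along the x axis of S.
    """
    beta = v / C
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    px, py, pz = p
    px_prime = gamma * (px - beta * W / C)
    W_prime = gamma * (W - beta * C * px)
    return (px_prime, py, pz), W_prime


if __name__ == "__main__":
    m0 = 1.0  # kg, hypothetical particle
    v = 0.8 * C
    # Particle at rest in S': return to S by reversing the sign of v.
    p, W = boost_momentum_energy((0.0, 0.0, 0.0), m0 * C ** 2, -v)
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    print(p[0] / (gamma * m0 * v))       # ratio to m0 v / sqrt(1 - v^2/c^2)
    print(W / (gamma * m0 * C ** 2))     # ratio to m0 c^2 / sqrt(1 - v^2/c^2)
```

Reversing the sign of v plays the role of interchanging primed and unprimed symbols, exactly as described in the text.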
We can now look at the fourth component of the Minkowski force.
We find from (5-5), (5-12), and (3-34) that

$$F_4 = \frac{dP_4}{d\tau} = \frac{i}{c}\frac{dW}{d\tau} = \frac{i}{c\sqrt{1 - u^2/c^2}}\frac{dW}{dt} \tag{5-20}$$

and is therefore related to the time rate of change of the energy, or to the
rate at which the force is doing work on the particle.

We can see this in quite another way and can further justify the
interpretation of W as the energy by proceeding as follows: From (5-3) and
(4-29), we find that

$$\sum_\mu P_\mu^2 = m_0^2\sum_\mu U_\mu^2 = -m_0^2c^2 \tag{5-21}$$

and, on differentiating with respect to $\tau$ and using (5-5), we obtain

$$\sum_\mu P_\mu\frac{dP_\mu}{d\tau} = \sum_\mu P_\mu F_\mu = 0 \tag{5-22}$$

(Since the scalar product of $P_\mu$ and $F_\mu$ is zero, we can say that the
4-momentum and the 4-force are always “perpendicular.”) If we write out
(5-22) in detail and use (5-4), (5-7), and (5-20), we find that

$$\frac{m_0\,\mathbf{u}\cdot\mathbf{f}}{1 - u^2/c^2} = -\left(\frac{im_0c}{\sqrt{1 - u^2/c^2}}\right)\left(\frac{i}{c\sqrt{1 - u^2/c^2}}\frac{dW}{dt}\right)$$

so that

$$\frac{dW}{dt} = \mathbf{f}\cdot\mathbf{u} \tag{5-23}$$

Therefore the time rate of change of W is equal to the rate at which the 
force does work on the particle, and, since this is the way in which increase 
of energy is defined (I: 4-16), the result (5-23) is further justification for 
our interpretation of W as the energy. 

The energy can be expressed in terms of the linear momentum by
writing out (5-21) and using (5-17); the result is that

$$W^2 = c^2\sum_{j=1}^{3}P_j^2 + (m_0c^2)^2 = (pc)^2 + (m_0c^2)^2 \tag{5-24}$$

which is a convenient starting point for the development of the
Hamiltonian formulation of relativistic mechanics.
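
A short numerical check of (5-24) may be helpful. The sketch below, in Python and not part of the original text, computes W from the momentum and compares it with (5-13); it also evaluates $pc^2/W$, which follows from (5-4) and (5-13) and should return the particle speed. The function names are assumptions made for this example.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def energy_from_momentum(m0, p):
    """W from (5-24) for rest mass m0 and momentum magnitude p."""
    return math.sqrt((p * C) ** 2 + (m0 * C ** 2) ** 2)


def speed_from_momentum(m0, p):
    """Speed u = p c^2 / W, obtained by dividing (5-4) by (5-13)."""
    return p * C ** 2 / energy_from_momentum(m0, p)


if __name__ == "__main__":
    m0 = 1.0  # kg, hypothetical particle
    u = 0.6 * C
    gamma = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    p = gamma * m0 * u
    print(energy_from_momentum(m0, p) / (gamma * m0 * C ** 2))  # compare with (5-13)
    print(speed_from_momentum(m0, p) / u)                       # compare with u
```

Because W is expressed here as a function of p alone, it has the form one needs for a Hamiltonian, which is the use mentioned in the text.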

Other important aspects of mechanics which we have not yet mentioned 
explicitly are the conservation laws of momentum and energy. Since we 
have seen that we can no longer consider energy and momentum sepa- 
rately, it would seem that the natural relativistic generalization would simply 
be the conservation of the 4-momentum. In fact, this is exactly what has 
been found to be correct experimentally, and, in addition, this generalized 
conservation law holds for a system of particles, even when the number of 
particles and their rest masses are different in the initial and final states. 
The concept of 4-momentum conservation is particularly useful in the 
discussion of collisions. In quantitative form, this conservation law can be
written as the four equations

$$\sum_{i=1}^{N}P_\mu^b(i) = \sum_{j=1}^{N'}P_\mu^a(j) \tag{5-25}$$

where $P_\mu^b(i)$ is the $\mu$th component of the 4-momentum of the ith particle
before the collision (or general interaction); similarly, the superscript a on
the right labels the values after the collision. In (5-25) there is also
provision for the number of particles to change from N to N'. The fact
that the sum of $P_4$ is also conserved shows that the rest energies and
kinetic energies need not be conserved individually, although their sum
must be. In other words, rest mass and kinetic energy can be converted
into each other. The conservation law (5-25) has been well verified 
experimentally, particularly in reactions involving atomic nuclei and in 
collisions of high-energy particles of various kinds. The excellent agree- 
ment of (5-25) with experiment provides additional strong evidence for 
the basic correctness of special relativity. 
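
As one concrete use of (5-25), consider a particle at rest that breaks up into two particles. The sketch below is in Python and is not taken from the original text; it assumes units in which c = 1, and the function name `two_body_decay` and the sample masses are hypothetical. The closed-form energies follow from solving the four equations (5-25) together with (5-24) applied to each product.

```python
import math


def two_body_decay(M, m1, m2):
    """Energies and common momentum magnitude of two decay products.

    A particle of rest mass M, at rest, breaks up into rest masses m1 and
    m2. Units with c = 1 are assumed, so masses, energies and momenta all
    share one unit.
    """
    if M < m1 + m2:
        raise ValueError("rest mass M is too small for this break-up")
    W1 = (M ** 2 + m1 ** 2 - m2 ** 2) / (2.0 * M)
    W2 = (M ** 2 + m2 ** 2 - m1 ** 2) / (2.0 * M)
    p = math.sqrt(max(W1 ** 2 - m1 ** 2, 0.0))
    return W1, W2, p


if __name__ == "__main__":
    M, m1, m2 = 10.0, 3.0, 4.0  # hypothetical rest masses
    W1, W2, p = two_body_decay(M, m1, m2)
    print("energy before:", M, "energy after:", W1 + W2)
    print("kinetic energy released:", (W1 - m1) + (W2 - m2))
    print("rest mass converted:", M - m1 - m2)
```

The two products carry equal and opposite momenta of magnitude p, so the three momentum equations of (5-25) are satisfied automatically, and the kinetic energy that appears equals the rest mass that disappears.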


Exercises 


5-1. Find the transformation laws for the components of the Minkowski force
and of the ordinary force. Partial answer: $F_1' = \gamma(F_1 + i\beta F_4)$,
$f_y'(1 - u'^2/c^2)^{-1/2} = f_y(1 - u^2/c^2)^{-1/2}$.

5-2. For many purposes, electromagnetic radiation can be treated as
composed of small, localized “clumps” of radiation called photons. Show that, if
we regard a photon as a particle of zero rest mass and total energy $W = h\nu$,
the Doppler and aberration formulas, (3-27) and (3-30), can be obtained from
the transformation laws for the 4-momentum $P_\mu$.

5-3. Two particles, each of mass 4M, are connected by a compressed spring
of negligible rest mass. The particles are tied together with a massless string,
and the whole system is at rest in a coordinate frame $S_0$. The string is then cut,
and the two particles fly off in opposite directions, each with speed $u_0$. What is
the initial potential energy of the system in $S_0$? By explicitly transforming the
velocities to another frame S, find the final energy and momentum, and thus the
initial energy and momentum, in S. Then find the rest mass of the initial
system in the frame S, and interpret the result.


6 Electrodynamics in vacuum 


The theory of special relativity essentially assumes that Maxwell’s equa- 
tions are covariant with respect to Lorentz transformations, although we 
originally required only that the speed of light be an invariant. It is both 
instructive and useful to write Maxwell’s equations in 4-vector and tensor 
form. 

According to the results given in (I: Sec. 19-9), Maxwell’s equations for
a vacuum in mks units are

$$\operatorname{div}\mathbf{E} = \frac{\rho}{\epsilon_0} \tag{6-1a}$$

$$\operatorname{curl}\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t} \tag{6-1b}$$

$$\operatorname{div}\mathbf{B} = 0 \tag{6-1c}$$

$$\operatorname{curl}\mathbf{B} = \mu_0\mathbf{J} + \frac{1}{c^2}\frac{\partial\mathbf{E}}{\partial t} \tag{6-1d}$$

where we have written them entirely in terms of E and B and have used
$\mu_0\epsilon_0 = c^{-2}$. We also have the equation

$$\mathbf{J} = \rho\mathbf{u} \tag{6-2}$$

which relates the current density to the charge density $\rho$ of the moving
charges which have velocity u. These equations imply the equation of
continuity

$$\operatorname{div}\mathbf{J} + \frac{\partial\rho}{\partial t} = 0 \tag{6-3}$$

which can be obtained from (6-1a), (6-1d), and the vector identity
div curl B = 0.

We can use the results of Chapter 4 to transform the differential
operators involved in these equations, but we would also like to know the
transformation properties of $\rho$, J, E, and B.


6-1 The 4-current and the 4-potential


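We recall the results of (I: Chapter 31), namely, that we can write
Maxwell’s equations in terms of a vector potential A and a scalar potential
$\phi$ by the defining equations

$$\mathbf{B} = \operatorname{curl}\mathbf{A}, \qquad \mathbf{E} = -\operatorname{grad}\phi - \frac{\partial\mathbf{A}}{\partial t} \tag{6-4}$$

and if we impose the Lorentz condition

$$\operatorname{div}\mathbf{A} + \frac{1}{c^2}\frac{\partial\phi}{\partial t} = 0 \tag{6-5}$$

then the differential equations satisfied by the potentials are

$$\nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = -\mu_0\mathbf{J}, \qquad \nabla^2\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = -\frac{\rho}{\epsilon_0} \tag{6-6}$$

which can be compactly written

$$\Box^2\mathbf{A} = -\mu_0\mathbf{J}, \qquad \Box^2\phi = -\frac{\rho}{\epsilon_0} \tag{6-7}$$

if we use (4-41).

These equations, of course, still imply the equation of continuity, which
we certainly want to be covariant; that is, we want

$$\operatorname{div}\mathbf{J} + \frac{\partial\rho}{\partial t} = 0 \quad\text{and}\quad \operatorname{div}'\mathbf{J}' + \frac{\partial\rho'}{\partial t'} = 0 \tag{6-8}$$

If we introduce four quantities $J_\mu$ by means of the respective equalities
given by

$$(J_1, J_2, J_3, J_4) = (J_x, J_y, J_z, ic\rho) \tag{6-9}$$

then (6-8) can be written

$$\sum_\mu\frac{\partial J_\mu}{\partial x_\mu} = 0 \quad\text{and}\quad \sum_\mu\frac{\partial J_\mu'}{\partial x_\mu'} = 0 \tag{6-10}$$

with the use of (4-7).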

Since (6-10) has the form of the divergence of a 4-vector, this result
makes us strongly suspect that $J_\mu$ as defined in (6-9) is a 4-vector. We now
show that it actually is a 4-vector. Let us consider an element of volume
dV in a coordinate system S; the charges in dV have the velocity u. The
total charge contained in dV is $\rho\,dV$. We now consider another coordinate
system $S_0$, the rest system, which is defined as that in which the charges
are at rest so that $u_0 = 0$. In the volume element $dV_0$ in $S_0$ which
corresponds to dV of S, the total charge is $\rho_0\,dV_0$, where $\rho_0$ is the charge
density in the rest system. We make the natural assumption that total
charge is an invariant; that is, we cannot expect to change the total
amount of charge involved in some phenomenon (such as the charge of
an electron or proton) by merely observing it from a different coordinate
system. Therefore, equating the charges, we obtain

$$\rho\,dV = \rho_0\,dV_0 \tag{6-11}$$

Now the relative velocity of the two coordinate systems is u; since the
dimensions along the relative velocity are connected by the Lorentz
contraction formula (3-10) while the dimensions transverse to the relative
motion are not affected, the two volumes are related by

$$dV = dV_0\sqrt{1 - u^2/c^2} \tag{6-12}$$

When (6-11) and (6-12) are combined, we obtain for charge densities the
transformation formula

$$\rho = \frac{\rho_0}{\sqrt{1 - u^2/c^2}} \tag{6-13}$$

Using (6-2), (6-13), and (4-27a), we find that

$$J_x = \rho u_x = \frac{\rho_0u_x}{\sqrt{1 - u^2/c^2}} = \rho_0U_1$$

Similarly, we find that

$$J_y = \rho_0U_2, \qquad J_z = \rho_0U_3, \qquad ic\rho = \rho_0U_4$$

so that (6-9) can be written

$$J_\mu = \rho_0U_\mu \tag{6-14}$$

showing that $J_\mu$ is actually a 4-vector since it is the product of the
4-velocity and the scalar invariant $\rho_0$ (rest system charge density). The
4-vector given in (6-14) and related to the more usual quantities by (6-9)
is called the 4-current. The transformation properties of $J_\mu$ are, of course,
given by (4-22), or, to put it another way, we see that $J_x$, $J_y$, $J_z$, $\rho$ transform,
respectively, like x, y, z, t. Therefore, for our particular Lorentz
transformation (2-10), we can immediately say that

$$J_x' = \gamma(J_x - v\rho), \quad J_y' = J_y, \quad J_z' = J_z, \quad \rho' = \gamma\left(\rho - \frac{vJ_x}{c^2}\right) \tag{6-15}$$

with the inverse equations

$$J_x = \gamma(J_x' + v\rho'), \qquad \rho = \gamma\left(\rho' + \frac{vJ_x'}{c^2}\right) \tag{6-16}$$

As an example of the use of these equations, let us consider the case in
which S' is fixed within a material body which moves with constant speed
u relative to S; then we can replace v by u in (6-15) and (6-16). If we
consider the non-relativistic case for which $u/c \ll 1$, then $\gamma \approx 1$ and (6-16)
yields

$$J_x \approx J_x' + u\rho', \qquad \rho \approx \rho' \tag{6-17}$$

Thus, while an observer on the moving body measures a charge density $\rho'$
and a current density $J_x'$, his colleague on S finds the current $J_x'$ to be
augmented by the convection current $u\rho'$ which is due to the motion of the
charge density $\rho'$ with respect to S.
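
The transformation (6-15) can be exercised numerically on the simplest case, a charge distribution at rest in S'. The following Python sketch is not part of the original text; the function name `boost_current` and the sample numbers are assumptions for illustration. It also checks that $J_x^2 - c^2\rho^2$ takes the same value in both systems, as it must for the square of a 4-vector built according to (6-9).

```python
import math

C = 299_792_458.0  # speed of light in m/s


def boost_current(Jx, rho, v):
    """Apply (6-15) to the x component of current density and the charge density.

    Returns (Jx', rho') as seen in S', assumed to move with speed v along
    the x axis of S. The y and z components are unchanged and are omitted.
    """
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    Jx_prime = gamma * (Jx - v * rho)
    rho_prime = gamma * (rho - v * Jx / C ** 2)
    return Jx_prime, rho_prime


if __name__ == "__main__":
    rho0 = 2.0e-6  # C/m^3, hypothetical rest-system charge density
    v = 0.6 * C
    # Charges at rest in S': go back to S by reversing the sign of v.
    Jx, rho = boost_current(0.0, rho0, -v)
    print("rho / rho0 =", rho / rho0)            # compare with (6-13)
    print("Jx / (rho * v) =", Jx / (rho * v))    # compare with (6-2)
    print("invariant ratio =", (Jx ** 2 - (C * rho) ** 2) / (-(C * rho0) ** 2))
```

For small v/c the same function reproduces the convection current of (6-17), since $\gamma$ is then indistinguishable from unity.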

If we define the 4-vector $A_\mu$ by

$$(A_1, A_2, A_3, A_4) = \left(A_x, A_y, A_z, \frac{i\phi}{c}\right) \tag{6-18}$$

then, with the use of (6-9), the four equations of (6-7) can be combined
into

$$\Box^2A_\mu = -\mu_0J_\mu \tag{6-19}$$

since

$$\Box^2\left(\frac{i\phi}{c}\right) = -\frac{i\rho}{c\epsilon_0} = -\frac{J_4}{c^2\epsilon_0} = -\mu_0J_4$$

while (6-5) becomes simply

$$\sum_\mu\frac{\partial A_\mu}{\partial x_\mu} = 0 \tag{6-20}$$

This 4-vector $A_\mu$ is called the 4-potential.

Equations (6-19) and (6-20) are the covariant forms of Maxwell’s
equations as written in terms of the potentials; at this stage, the fields
must be still found by the use of (6-4).


6-2 The electromagnetic field tensor 


In order that Maxwell’s equations be covariant, we want B = curl A to
hold in S, while in S' we have B' = curl' A'; thus, in principle at least,
we know how the components of B transform. To find the transformation
properties, we can begin by considering one of the components of the
equation determining B; thus we obtain

$$B_x = \frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z} = \frac{\partial A_3}{\partial x_2} - \frac{\partial A_2}{\partial x_3} = f_{23} \tag{6-21}$$

with the use of (6-4), (4-7), and (6-18) and where we have introduced the
antisymmetric tensor

$$f_{\mu\nu} = \frac{\partial A_\nu}{\partial x_\mu} - \frac{\partial A_\mu}{\partial x_\nu} \tag{6-22}$$

which is the four-dimensional curl of $A_\mu$.

If we continue this procedure and calculate all the other components of
$f_{\mu\nu}$ in the same way, we find that

$$\begin{aligned}
f_{12} &= -f_{21} = B_z, & f_{31} &= -f_{13} = B_y \\
f_{14} &= -f_{41} = -\frac{iE_x}{c}, & f_{24} &= -f_{42} = -\frac{iE_y}{c} \\
f_{34} &= -f_{43} = -\frac{iE_z}{c}
\end{aligned} \tag{6-23}$$

Thus the field vectors E and B can be written as components of the second
rank antisymmetric tensor $f_{\mu\nu}$ according to the matrix representation

$$f_{\mu\nu} = \begin{pmatrix}
0 & B_z & -B_y & -iE_x/c \\
-B_z & 0 & B_x & -iE_y/c \\
B_y & -B_x & 0 & -iE_z/c \\
iE_x/c & iE_y/c & iE_z/c & 0
\end{pmatrix} \tag{6-24}$$

This tensor is known as the electromagnetic field tensor. Since we know
how the components of a tensor transform, we are able to find the
transformation properties of the field components; we shall do this in the
next section.

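It can also be shown that all of Maxwell’s equations as written in terms
of the fields are contained in the following system of equations:

$$\frac{\partial f_{\lambda\rho}}{\partial x_\nu} + \frac{\partial f_{\rho\nu}}{\partial x_\lambda} + \frac{\partial f_{\nu\lambda}}{\partial x_\rho} = 0 \tag{6-25}$$

$$\sum_\nu\frac{\partial f_{\mu\nu}}{\partial x_\nu} = \mu_0J_\mu \tag{6-26}$$

As an illustration, let us consider the first component of (6-25), that is, the
form in which the index one is not used; with the help of (6-24), we find
that

$$\frac{\partial f_{34}}{\partial x_2} + \frac{\partial f_{42}}{\partial x_3} + \frac{\partial f_{23}}{\partial x_4} = -\frac{i}{c}\frac{\partial E_z}{\partial y} + \frac{i}{c}\frac{\partial E_y}{\partial z} + \frac{1}{ic}\frac{\partial B_x}{\partial t} = 0$$

which can be rearranged as

$$\frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z} = -\frac{\partial B_x}{\partial t}$$

and which is just the x component of (6-1b). Similarly, it can be shown
that the remaining three components of (6-25) are the y and z components
of (6-1b) and the single equation (6-1c).

If we set $\mu = 1$ in (6-26), and use (6-24) and (6-9), we obtain

$$\frac{\partial f_{11}}{\partial x_1} + \frac{\partial f_{12}}{\partial x_2} + \frac{\partial f_{13}}{\partial x_3} + \frac{\partial f_{14}}{\partial x_4} = \mu_0J_x = \frac{\partial B_z}{\partial y} - \frac{\partial B_y}{\partial z} - \left(\frac{i}{c}\right)\frac{1}{ic}\frac{\partial E_x}{\partial t}$$

which can be written

$$\frac{\partial B_z}{\partial y} - \frac{\partial B_y}{\partial z} = \mu_0J_x + \frac{1}{c^2}\frac{\partial E_x}{\partial t}$$

and is seen to be the x component of (6-1d). Similarly, the values $\mu = 2$
and 3 give the remaining components of (6-1d), and when $\mu = 4$ is used
in (6-26) we find that the result is (6-1a). Thus we see that (6-25) and
(6-26) are actually Maxwell’s equations (6-1) written in terms of the
electromagnetic field tensor.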


6-3 Transformation equations for the electromagnetic field 


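According to (4-30), the transformation law for the electromagnetic
field tensor is

$$f_{\mu\nu}' = \sum_\lambda\sum_\rho a_{\mu\lambda}a_{\nu\rho}f_{\lambda\rho} \tag{6-27}$$

We shall consider only the particular Lorentz transformation for which
the $a_{\mu\nu}$ are given by (4-19). As an example, let us consider the 14 component
of (6-27); using (4-30), (6-24), and $f_{\mu\nu} = -f_{\nu\mu}$, we find from (6-27) that

$$\begin{aligned}
f_{14}' = -\frac{iE_x'}{c} &= \sum_\lambda\sum_\rho a_{1\lambda}a_{4\rho}f_{\lambda\rho} = \sum_\lambda a_{1\lambda}(a_{41}f_{\lambda 1} + a_{44}f_{\lambda 4}) \\
&= a_{11}(a_{41}f_{11} + a_{44}f_{14}) + a_{14}(a_{41}f_{41} + a_{44}f_{44}) \\
&= (a_{11}a_{44} - a_{14}a_{41})f_{14} = (\gamma^2 - \beta^2\gamma^2)f_{14} \\
&= f_{14} = -\frac{iE_x}{c}
\end{aligned}$$

and, therefore, $E_x' = E_x$. Similarly, for the 42 component of (6-27), we
find

$$f_{42}' = \frac{iE_y'}{c} = \sum_\lambda\sum_\rho a_{4\lambda}a_{2\rho}f_{\lambda\rho} = \sum_\lambda a_{4\lambda}a_{22}f_{\lambda 2} = a_{41}f_{12} + a_{44}f_{42} = (-i\beta\gamma)B_z + \gamma\frac{iE_y}{c}$$

and, therefore, $E_y' = \gamma(E_y - vB_z)$. Proceeding in the same way, we can
find the complete set of transformation formulas:

$$\begin{aligned}
E_x' &= E_x, & B_x' &= B_x \\
E_y' &= \gamma(E_y - vB_z), & B_y' &= \gamma\left(B_y + \frac{vE_z}{c^2}\right) \\
E_z' &= \gamma(E_z + vB_y), & B_z' &= \gamma\left(B_z - \frac{vE_y}{c^2}\right)
\end{aligned} \tag{6-28}$$

The inverse transformations are

$$E_x = E_x', \qquad E_y = \gamma(E_y' + vB_z'), \qquad \text{etc.}$$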
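
The double sum in (6-27) is tedious by hand but simple to carry out by machine, which gives an independent check on the whole set (6-28). The sketch below is in Python and is not part of the original text. The coefficient matrix in `boost_coefficients` is an assumed form of (4-19), inferred from the values $a_{11} = a_{44} = \gamma$, $a_{14} = -a_{41} = i\beta\gamma$ used in the worked components above; all function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def field_tensor(E, B):
    """4x4 matrix f of (6-24); E in V/m and B in T, each a tuple (x, y, z)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0, Bz, -By, -1j * Ex / C],
        [-Bz, 0.0, Bx, -1j * Ey / C],
        [By, -Bx, 0.0, -1j * Ez / C],
        [1j * Ex / C, 1j * Ey / C, 1j * Ez / C, 0.0],
    ]


def boost_coefficients(v):
    """Coefficients a[mu][nu] for a boost with speed v along x (assumed form of (4-19))."""
    beta = v / C
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [
        [gamma, 0.0, 0.0, 1j * beta * gamma],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [-1j * beta * gamma, 0.0, 0.0, gamma],
    ]


def transform_tensor(a, f):
    """Carry out the double sum of (6-27) for every pair (mu, nu)."""
    return [
        [
            sum(a[mu][lam] * a[nu][rho] * f[lam][rho]
                for lam in range(4) for rho in range(4))
            for nu in range(4)
        ]
        for mu in range(4)
    ]


def fields_from_tensor(f):
    """Read E and B back out of a tensor laid out as in (6-24)."""
    E = tuple((1j * C * f[k][3]).real for k in range(3))
    B = (f[1][2].real, f[2][0].real, f[0][1].real)
    return E, B


def boost_fields(E, B, v):
    """Closed-form transformation (6-28), used here only for comparison."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    E_new = (Ex, gamma * (Ey - v * Bz), gamma * (Ez + v * By))
    B_new = (Bx, gamma * (By + v * Ez / C ** 2), gamma * (Bz - v * Ey / C ** 2))
    return E_new, B_new


if __name__ == "__main__":
    E = (1.0, 2.0, 3.0)            # V/m, hypothetical field values
    B = (4.0e-9, 5.0e-9, 6.0e-9)   # T, hypothetical field values
    v = 0.5 * C
    by_tensor = fields_from_tensor(transform_tensor(boost_coefficients(v), field_tensor(E, B)))
    by_formula = boost_fields(E, B, v)
    print(by_tensor)
    print(by_formula)
```

The two printed results should agree to within rounding error, component by component.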


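We can actually discard as unnecessary the restriction that the relative
velocity v of S' with respect to S be along the x axis. Since the orientation
of the x axis is completely arbitrary, if we introduce the field components
parallel ($\parallel$) and perpendicular ($\perp$) to the direction of translation, we can
write (6-28) in general as

$$\begin{aligned}
\mathbf{E}_\parallel' &= \mathbf{E}_\parallel, & \mathbf{B}_\parallel' &= \mathbf{B}_\parallel \\
\mathbf{E}_\perp' &= \gamma(\mathbf{E}_\perp + \mathbf{v}\times\mathbf{B}_\perp) & \mathbf{B}_\perp' &= \gamma\left(\mathbf{B}_\perp - \frac{\mathbf{v}\times\mathbf{E}_\perp}{c^2}\right) \\
&= \gamma(\mathbf{E} + \mathbf{v}\times\mathbf{B})_\perp, & &= \gamma\left(\mathbf{B} - \frac{\mathbf{v}\times\mathbf{E}}{c^2}\right)_\perp
\end{aligned} \tag{6-29}$$

Example. Pure Electric Case in S. Suppose $\mathbf{E} \neq 0$ in S, but B = 0.
Then we find from (6-29) that, in S',

$$\mathbf{B}_\parallel' = 0, \qquad \mathbf{B}_\perp' = -\frac{\gamma}{c^2}\mathbf{v}\times\mathbf{E}_\perp$$

$$\mathbf{E}_\parallel' = \mathbf{E}_\parallel, \qquad \mathbf{E}_\perp' = \gamma\mathbf{E}_\perp$$

so that

$$\mathbf{B}' = \mathbf{B}_\perp' = -\frac{1}{c^2}\mathbf{v}\times\mathbf{E}_\perp' = -\frac{1}{c^2}\mathbf{v}\times\mathbf{E}' \tag{6-30}$$

Thus we see that what appeared to one observer as purely an electric
field appears as both an electric and a magnetic field to a second
observer moving with respect to the first one.


Example. Pure Magnetic Case in S. Suppose now E = 0 but $\mathbf{B} \neq 0$.
Then we find from (6-29) that, in S',

$$\mathbf{E}_\parallel' = 0, \qquad \mathbf{E}_\perp' = \gamma\,\mathbf{v}\times\mathbf{B}_\perp$$

$$\mathbf{B}_\parallel' = \mathbf{B}_\parallel, \qquad \mathbf{B}_\perp' = \gamma\mathbf{B}_\perp$$

so that

$$\mathbf{E}' = \mathbf{E}_\perp' = \mathbf{v}\times\mathbf{B}_\perp' = \mathbf{v}\times\mathbf{B}' \tag{6-31}$$

Thus we see that what appeared to one observer as purely a magnetic
field appears as both a magnetic and an electric field to a second
observer moving with respect to the first one.


In the non-relativistic limit in which $v/c \ll 1$, $\gamma \approx 1$ and the results of
the last example show that $\mathbf{B}' \approx \mathbf{B}$; hence

$$\mathbf{E}' \approx \mathbf{v}\times\mathbf{B} \tag{6-32}$$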


Thus in the case in which the moving observer is not going too fast, the 
electric field is given by (6-32) and is exactly what we gave in (1-2) as a 
first approximation to be used in discussing electromagnetic effects in 
moving media—justifying our previous result which was obtained in 
another way. 

The results found in this section clearly show that the electric and 
magnetic fields E and B have no independent existence as separate entities. 
The fundamental complex is the field tensor $f_{\mu\nu}$, and the resolution into
electric and magnetic components is wholly relative to the motion of the 
observer. 

The transformation equations, (6-28) or (6-29), sometimes make the
solution of certain problems easier by enabling one to choose an appropriate
coordinate system in which the answer can be simply found and then
to obtain the desired results by transforming back to the actual system
of interest. Of course, one does not get any results this way that could
not be obtained by directly solving Maxwell’s equations, but it is
generally faster. We shall illustrate this method with an instructive and
important example.


6-4 Field of a uniformly moving point charge 


We consider a point charge q moving with a constant velocity u with
respect to the system S as illustrated in Fig. 6-1.

Fig. 6-1

Let us choose S' to be the system for which q is at rest at the origin.
The field in S' is then just the electrostatic Coulomb field of a point charge
as given by (I: 19-6):

$$\mathbf{E}' = \frac{q\mathbf{r}'}{4\pi\epsilon_0r'^3}, \qquad \mathbf{B}' = 0 \tag{6-33}$$

where

$$r' = (x'^2 + y'^2 + z'^2)^{1/2} \tag{6-34}$$

is the distance to the point at which the field is evaluated. Inserting (6-33)
into (6-28) and using (6-34) and (2-10), we find one of the field components
in S to be

$$E_x = E_x' = \frac{qx'}{4\pi\epsilon_0r'^3} = \frac{q\gamma(x - ut)}{4\pi\epsilon_0[\gamma^2(x - ut)^2 + y^2 + z^2]^{3/2}} \tag{6-35}$$

where

$$\gamma = \frac{1}{\sqrt{1 - u^2/c^2}} \tag{6-36}$$

If we let X = ut be the position of q along the x axis so that the coordinates
of q in S are (X, 0, 0), then (6-35) can also be written

$$E_x = \frac{q\gamma(x - X)}{4\pi\epsilon_0[\gamma^2(x - X)^2 + y^2 + z^2]^{3/2}} \tag{6-37}$$

The other two components of E can be found from (6-28) in the same way;
the results are that

$$E_y = \frac{q\gamma y}{4\pi\epsilon_0[\gamma^2(x - X)^2 + y^2 + z^2]^{3/2}} \tag{6-38}$$

$$E_z = \frac{q\gamma z}{4\pi\epsilon_0[\gamma^2(x - X)^2 + y^2 + z^2]^{3/2}} \tag{6-39}$$

The last three equations give the electric field at the point (x, y, z) in S
when the point charge q is at (X, 0, 0) in the same system.
The value of B could also be calculated from (6-28); however, we can
simply use the result given for the pure electric case (6-30) by taking
account of the interchange of S and S' so that we have

$$\mathbf{B} = \frac{\mathbf{u}\times\mathbf{E}}{c^2} \tag{6-40}$$

from which the components of B can be found explicitly by using (6-37)
through (6-39). Thus we have found the exact solution of our problem
in a comparatively simple manner.

Let us investigate the characteristics of this field by choosing the instant
that the charge is at the origin, that is, t = 0. The basic structure of the
field will be the same for all later times, and it can be obtained by simply
translating the result for t = 0 with speed u along x, according to (6-37)
through (6-40). Setting X = 0 in the equations for $E_x$, $E_y$, and $E_z$, we
see that we can write

$$\mathbf{E} = \frac{q\gamma\mathbf{r}}{4\pi\epsilon_0(\gamma^2x^2 + y^2 + z^2)^{3/2}} \tag{6-41}$$

which shows that E is directed radially outward, while B is perpendicular
to the plane of u and E, according to (6-40); this situation is illustrated in
Fig. 6-2.

Fig. 6-2

Fig. 6-3

Fig. 6-4

We also see from the figure that

$$x = r\cos\theta, \qquad y^2 + z^2 = r^2\sin^2\theta$$

so that if we use (6-36) we find that

$$\gamma^2x^2 + y^2 + z^2 = r^2\gamma^2(1 - \beta^2\sin^2\theta) \tag{6-42}$$

where $\beta = u/c$. Substituting (6-42) into (6-41), we find

$$\mathbf{E} = \frac{q(1 - \beta^2)\mathbf{r}}{4\pi\epsilon_0r^3(1 - \beta^2\sin^2\theta)^{3/2}} \tag{6-43}$$

showing us that the field is inverse square in its dependence on radial
distance r, but that its magnitude at a given distance depends on the
direction. If we calculate the magnitude of the electric field from (6-43)
for the two extreme cases, we find that:

(1) directly in front, $\theta = 0$,

$$E_\parallel = \frac{q}{4\pi\epsilon_0r^2}(1 - \beta^2) \tag{6-44}$$

(2) at one side, $\theta = \pi/2$,

$$E_\perp = \frac{q}{4\pi\epsilon_0r^2\sqrt{1 - \beta^2}} \tag{6-45}$$

Therefore, for a very fast moving charge for which $\beta \approx 1$, we see that
at a given distance $E_\parallel$ is very small while $E_\perp$ is very large, whereas if
$\beta \ll 1$ both of these components approach equality with the static
Coulomb field.

These results are also illustrated in Fig. 6-3 which shows the magnitude
of the field for a given distance plotted as a function of angle $\theta$ from the
direction of motion for $\beta = 0.9$. We see from this figure that for a very
rapidly moving charge practically the whole field is concentrated in a
small angle around the equatorial plane, an effect which is illustrated very
schematically in Fig. 6-4.
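
The angular dependence in (6-43) can be tabulated directly. The short Python sketch below is not part of the original text; the function name `field_ratio` is hypothetical, and the quantity returned is the magnitude of (6-43) divided by the static Coulomb value $q/4\pi\epsilon_0r^2$, so that the charge and the distance drop out.

```python
import math


def field_ratio(beta, theta):
    """|E| from (6-43) divided by the static Coulomb magnitude q / (4 pi eps0 r^2).

    beta = u/c, and theta is the angle in radians from the direction of motion.
    """
    return (1.0 - beta ** 2) / (1.0 - (beta * math.sin(theta)) ** 2) ** 1.5


if __name__ == "__main__":
    beta = 0.9  # the value used for Fig. 6-3
    for degrees in (0, 30, 60, 80, 90):
        theta = math.radians(degrees)
        print(f"theta = {degrees:2d} deg: ratio = {field_ratio(beta, theta):.3f}")
```

At $\theta = 0$ the function reduces to the factor $1 - \beta^2$ of (6-44), and at $\theta = \pi/2$ to the factor $1/\sqrt{1 - \beta^2}$ of (6-45).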


Exercises 


6-1. Why are we justified in calling the quantities defined by (6-18) a 4-vector? 

6-2. Verify the results of (6-23). 

6-3. Show that (6-25) has only four different components, and show that 
(6-25) and (6-26) give all of Maxwell’s equations. 

6-4. Show that the quantities $\mathbf{E}\cdot\mathbf{B}$ and $B^2 - E^2/c^2$ are invariants. Apply
these results to the case of a plane wave in vacuum.

6-5. Show that the equations of motion of a particle of charge q in an
electromagnetic field where the force is given by

$$\mathbf{f} = q(\mathbf{E} + \mathbf{u}\times\mathbf{B}) \tag{6-46}$$

are given by

$$m_0\frac{d^2x_\mu}{d\tau^2} = q\sum_\nu f_{\mu\nu}U_\nu \tag{6-47}$$


6-6. A charged particle enters a uniform electric field E which is perpendicular
to the initial velocity $u_0$. Find the subsequent trajectory of the particle, and
show that it reduces to a parabola in the limit $u/c \to 0$.

6-7. Show that the correct relativistic Lagrangian and Hamiltonian functions
for a charged particle moving in an electromagnetic field can be written as

$$L = -m_0c^2\sqrt{1 - u^2/c^2} + q\mathbf{A}\cdot\mathbf{u} - q\phi$$

$$H = [(\mathbf{p} - q\mathbf{A})^2c^2 + m_0^2c^4]^{1/2} + q\phi$$

Also write these quantities in 4-vector form and find the form of the
corresponding Lagrangian and Hamiltonian 4-vector equations of motion.


Part Two 


Thermodynamics 


7 Mathematical introduction


We shall find in our discussion of thermodynamics that we will often have 
a fairly large selection of variables from which the independent variables 
can be chosen, and it is frequently convenient to transform from one set 
to another. Since partial derivatives are so extensively used in thermo- 
dynamics, it is desirable to have explicit formulas at hand which we can 
use to transform the derivatives. Therefore in this chapter we discuss a 
few aspects of functions of several variables which are particularly appro- 
priate to thermodynamics. 


7-1 Transformations of partial derivatives 


Suppose we are given the function z = z(x, y). By the usual rules of
calculus, the total differential of z is given by

$$dz = \left(\frac{\partial z}{\partial x}\right)_y dx + \left(\frac{\partial z}{\partial y}\right)_x dy \tag{7-1}$$

The notation for the partial derivatives which is shown in (7-1) is almost
universal in thermodynamics and requires a few words of explanation,
for it is a convenient way of showing what the independent variables are
and which are being held constant in the process of partial differentiation.
Thus, for example, $(\partial z/\partial x)_y$ tells us that the dependent variable z is a
function of the independent variables x and y, and that the rate of change
of z is being computed under the condition that y is held constant.
Accordingly, we shall generally (but not necessarily always) have 


$$\left(\frac{\partial z}{\partial x}\right)_y \neq \left(\frac{\partial z}{\partial x}\right)_w$$


since z is regarded as a function of different independent variables in the 
two cases. 

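This notation can easily be extended to more than two variables; if, 
for example, we have z = z(x, y, a, b, c, ...), then (∂z/∂x)_{y,a,b,c,...}, 
(∂z/∂a)_{x,y,b,c,...}, etc., are possible partial derivatives. 

Now suppose that, in addition, x and y are themselves functions of 
other variables u and v. Then their differentials would be given by 

$$dx = \left(\frac{\partial x}{\partial u}\right)_v du + \left(\frac{\partial x}{\partial v}\right)_u dv \tag{7-2}$$

$$dy = \left(\frac{\partial y}{\partial u}\right)_v du + \left(\frac{\partial y}{\partial v}\right)_u dv \tag{7-3}$$

If we substitute (7-2) and (7-3) into (7-1), we find that 

$$dz = \left[\left(\frac{\partial z}{\partial x}\right)_y \left(\frac{\partial x}{\partial u}\right)_v + \left(\frac{\partial z}{\partial y}\right)_x \left(\frac{\partial y}{\partial u}\right)_v\right] du + \left[\left(\frac{\partial z}{\partial x}\right)_y \left(\frac{\partial x}{\partial v}\right)_u + \left(\frac{\partial z}{\partial y}\right)_x \left(\frac{\partial y}{\partial v}\right)_u\right] dv \tag{7-4}$$

Since we have x = x(u, v) and y = y(u, v), then z itself is actually a function 
of u and v, and we can also write 

$$dz = \left(\frac{\partial z}{\partial u}\right)_v du + \left(\frac{\partial z}{\partial v}\right)_u dv \tag{7-5}$$

and, on comparing (7-4) and (7-5), we see that 

$$\left(\frac{\partial z}{\partial u}\right)_v = \left(\frac{\partial z}{\partial x}\right)_y \left(\frac{\partial x}{\partial u}\right)_v + \left(\frac{\partial z}{\partial y}\right)_x \left(\frac{\partial y}{\partial u}\right)_v \tag{7-6a}$$

$$\left(\frac{\partial z}{\partial v}\right)_u = \left(\frac{\partial z}{\partial x}\right)_y \left(\frac{\partial x}{\partial v}\right)_u + \left(\frac{\partial z}{\partial y}\right)_x \left(\frac{\partial y}{\partial v}\right)_u \tag{7-6b}$$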


are the formulas which enable us to transform the derivatives to different 
sets of independent variables. We now want to consider two important 
special cases of these results. 
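
As a numerical illustration of (7-6a), the following sketch compares a direct finite-difference derivative of z with respect to u against the right-hand side of the formula. The functions z = xy^2, x = uv, and y = u + v are hypothetical choices made only for this check, and the helper `partial` is an assumed central-difference routine, not something taken from the text.

```python
def partial(f, args, index, h=1e-6):
    """Central-difference estimate of the partial derivative of f with respect
    to argument number `index`, holding the other arguments constant."""
    up = list(args)
    dn = list(args)
    up[index] += h
    dn[index] -= h
    return (f(*up) - f(*dn)) / (2.0 * h)


# Hypothetical example functions (assumptions for illustration only).
def z_xy(x, y):
    return x * y ** 2


def x_uv(u, v):
    return u * v


def y_uv(u, v):
    return u + v


def z_uv(u, v):
    """z regarded directly as a function of the new variables u and v."""
    return z_xy(x_uv(u, v), y_uv(u, v))


def lhs_7_6a(u, v):
    """Left-hand side of (7-6a): (dz/du)_v computed directly."""
    return partial(z_uv, (u, v), 0)


def rhs_7_6a(u, v):
    """Right-hand side of (7-6a): (dz/dx)_y (dx/du)_v + (dz/dy)_x (dy/du)_v."""
    x, y = x_uv(u, v), y_uv(u, v)
    return (partial(z_xy, (x, y), 0) * partial(x_uv, (u, v), 0)
            + partial(z_xy, (x, y), 1) * partial(y_uv, (u, v), 0))


if __name__ == "__main__":
    # For u = 1, v = 2 both sides should be close to 30.
    print(lhs_7_6a(1.0, 2.0), rhs_7_6a(1.0, 2.0))
```

The two numbers agree to within the accuracy of the finite differences, which is all that a check of this kind can show.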

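Suppose that only one variable is being changed; that is, the new 
independent variables (u, y) replace the old set (x, y). Therefore, v = y 
and (∂y/∂u)_v = (∂v/∂u)_v = 0 and (7-6a) becomes 

$$\left(\frac{\partial z}{\partial u}\right)_y = \left(\frac{\partial z}{\partial x}\right)_y \left(\frac{\partial x}{\partial u}\right)_y \tag{7-7}$$

which can be written 

$$\left(\frac{\partial z}{\partial x}\right)_y \left(\frac{\partial x}{\partial u}\right)_y \left(\frac{\partial u}{\partial z}\right)_y = 1 \tag{7-8}$$

or 

$$\left(\frac{\partial z}{\partial x}\right)_y = \left(\frac{\partial z}{\partial u}\right)_y \left(\frac{\partial u}{\partial x}\right)_y$$

This result is called the chain relation, and we note that the same variable 
y is held constant in all of the differentiating. 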

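Now we shall let u = y and v = z; the net result is to change the 
dependent variable from z to x. In this case, then, 

$$\left(\frac{\partial z}{\partial u}\right)_v = 0, \qquad \left(\frac{\partial y}{\partial u}\right)_v = 1$$

so that (7-6a) becomes 

$$0 = \left(\frac{\partial z}{\partial x}\right)_y \left(\frac{\partial x}{\partial y}\right)_z + \left(\frac{\partial z}{\partial y}\right)_x$$

which can be written 

$$\left(\frac{\partial x}{\partial y}\right)_z = -\frac{(\partial z/\partial y)_x}{(\partial z/\partial x)_y} \tag{7-9}$$

or 

$$\left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial x}\right)_y = -1 \tag{7-10}$$

This result is called the cyclic relation. We note that the effect of (7-9) is 
to make the variable which is held constant on the left side become the 
dependent variable on the right side. 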
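
The cyclic relation (7-10) can be checked numerically with a short sketch. The test relation z = x^2 y (with x, y > 0) is a hypothetical choice, and each derivative is estimated by a central difference, so the product comes out equal to -1 only to within the finite-difference error.

```python
def partial(f, args, index, h=1e-6):
    """Central-difference estimate of the partial derivative of f with respect
    to argument number `index`, holding the other arguments constant."""
    up = list(args)
    dn = list(args)
    up[index] += h
    dn[index] -= h
    return (f(*up) - f(*dn)) / (2.0 * h)


# The same hypothetical relation z = x^2 * y, solved for each variable in turn.
def z_of(x, y):
    return x * x * y


def x_of(y, z):
    return (z / y) ** 0.5   # valid for x > 0, y > 0


def y_of(z, x):
    return z / (x * x)


def cyclic_product(x, y):
    """Product (dx/dy)_z (dy/dz)_x (dz/dx)_y evaluated at the point (x, y)."""
    z = z_of(x, y)
    dx_dy_z = partial(x_of, (y, z), 0)
    dy_dz_x = partial(y_of, (z, x), 0)
    dz_dx_y = partial(z_of, (x, y), 0)
    return dx_dy_z * dy_dz_x * dz_dx_y


if __name__ == "__main__":
    print(cyclic_product(1.5, 2.0))   # close to -1
```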


7-2 Legendre transformations 


Suppose we have a quantity z which is a function of t variables x_j, so 
that we can write z = z(x_1, x_2, ..., x_t). If we define 

$$p_j = \left(\frac{\partial z}{\partial x_j}\right)_{x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_t} \tag{7-11}$$

we can write the generalization of (7-1) as 

$$dz = \sum_{j=1}^{t} p_j\, dx_j \tag{7-12}$$


It is often desirable to obtain a function related to z in which some of the 
variables x_j are replaced by the corresponding p_j as the independent 
variables. Such a transformation is known as a Legendre transformation. 
Suppose we want the variables x_1, ..., x_l (l < t) replaced as independent 
variables by the derivatives p_1, ..., p_l. We designate the transformed 
function by z[p_1, ..., p_l], and we shall demonstrate the rule: Subtract the 
product of the two variables whose role you wish to interchange from the 
original function. Therefore 

$$z[p_1, \ldots, p_l] = z - \sum_{j=1}^{l} p_j x_j \qquad (l < t) \tag{7-13}$$


In order to verify the correctness of this rule, we calculate the differential 
of the transformed function and use (7-12): 

$$dz[p_1, \ldots, p_l] = dz - \sum_{j=1}^{l} (p_j\, dx_j + x_j\, dp_j) = -\sum_{j=1}^{l} x_j\, dp_j + \sum_{j=l+1}^{t} p_j\, dx_j \tag{7-14}$$
We see from (7-14) that the independent variables are now (p_1, ..., p_l, 
x_{l+1}, ..., x_t) as was desired. In addition, the previously independent x's 
are now derivable by differentiation from the transformed function; that 
is, 

$$x_j = -\frac{\partial z[p_1, \ldots, p_l]}{\partial p_j} \qquad (j = 1, \ldots, l) \tag{7-15}$$

while 

$$p_k = \frac{\partial z[p_1, \ldots, p_l]}{\partial x_k} \qquad (k = l+1, \ldots, t) \tag{7-16}$$

One of the best-known examples of a Legendre transformation is found 
in mechanics, namely, the transformation from the Lagrangian to the 
Hamiltonian formulation. In fact, we see from (I: 10-3) that the definition 

$$H(p_1, \ldots, p_n, q_1, \ldots, q_n) = \sum_j p_j \dot{q}_j - L(q_1, \ldots, q_n, \dot{q}_1, \ldots, \dot{q}_n)$$

shows that, except for a minus sign, the Hamiltonian function is the 
Legendre transformation of the Lagrangian which replaces the q̇_j by the 
p_j = ∂L/∂q̇_j as the independent variables. 
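
A one-variable sketch may make the rule (7-13) and the result (7-15) more concrete. The function z(x) = x^2 is a hypothetical example chosen because p = dz/dx = 2x can be inverted by hand; the names below are illustrative assumptions and do not come from the text.

```python
def z(x):
    """Hypothetical original function z(x) = x^2."""
    return x * x


def p_of_x(x):
    """Conjugate variable p = dz/dx = 2x, as in (7-11)."""
    return 2.0 * x


def x_of_p(p):
    """Inverse of p(x), needed to express the transform in terms of p."""
    return p / 2.0


def z_transformed(p):
    """Legendre transform z[p] = z - p x of (7-13), with x eliminated."""
    x = x_of_p(p)
    return z(x) - p * x


def derivative(f, a, h=1e-6):
    """Central-difference estimate of df/da."""
    return (f(a + h) - f(a - h)) / (2.0 * h)


if __name__ == "__main__":
    p = 3.0
    # (7-15) states that x = -dz[p]/dp; both numbers should be close to 1.5.
    print(x_of_p(p), -derivative(z_transformed, p))
```

For this example z[p] = -p^2/4, so the original variable is recovered as x = p/2, in agreement with (7-15).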


7-3 Exact and inexact differentials 


Suppose we consider a general expression for a differential quantity dz 
of the form 


$$dz = X(x, y)\, dx + Y(x, y)\, dy \tag{7-17}$$


where X and Y are given functions of x and y. When we come to calculate 
the total change by integrating (7-17) between the initial point 
(x_1, y_1) and the final point (x_2, y_2), we find that this change depends, in 
general, on the specific path in the xy plane which is used to go between 
the two points. The reason is that only for a specific path can y and dy 
be expressed as functions of x and dx, so that X, Y, and dz of (7-17) can be 
integrated in terms of the single variable x. Two such paths are illustrated 
in Fig. 7-1a. 

[Fig. 7-1. (a) Two paths in the xy plane between (x_1, y_1) and (x_2, y_2); (b) a closed path beginning and ending at (x_1, y_1).] 

However, if there should actually exist a definite function z(x, y) whose 
differential is given by (7-17), the value of z would depend only on the 
coordinates (x, y) of a given point and not on how the point was reached. 
In this case the integral of (7-17) must have the same value independent of 
the path of integration, namely, 

$$\int_{(x_1, y_1)}^{(x_2, y_2)} dz = z(x_2, y_2) - z(x_1, y_1) \tag{7-18}$$


Under these circumstances, the quantity z is then called a point function 
or state function and its differential dz is known as an exact differential. 
In other words, an exact differential is the differential of an actual function 
of position whose value depends only on the coordinates of the point. 

If the integral is taken over a closed path, such as that of Fig. 7-1b, 
the initial and final points coincide and we see from (7-18) that the integral 
over a closed path of an exact differential vanishes; that is, 

$$\oint dz = 0 \tag{7-19}$$

Examples of exact differentials are those of mechanical potential energy 
dV and the scalar potential dφ of electromagnetism. 

The fact that dz is an exact differential can be stated somewhat differently 
since, if z = z(x, y), then (7-17) can also be written 

$$dz = X(x, y)\, dx + Y(x, y)\, dy = \left(\frac{\partial z}{\partial x}\right)_y dx + \left(\frac{\partial z}{\partial y}\right)_x dy \tag{7-20}$$

so that 

$$X(x, y) = \left(\frac{\partial z}{\partial x}\right)_y, \qquad Y(x, y) = \left(\frac{\partial z}{\partial y}\right)_x \tag{7-21}$$

Because of the equality of mixed second partial derivatives, ∂²z/∂x ∂y = 
∂²z/∂y ∂x, we see from (7-21) that we must have 

$$\left(\frac{\partial X}{\partial y}\right)_x = \left(\frac{\partial Y}{\partial x}\right)_y \tag{7-22}$$

as a necessary condition that dz as given by (7-17) be exact. It can also be 
shown that (7-22) is a sufficient condition that dz be exact and, therefore, 
that there exists an actual function of position z(x, y) whose differential is 
given by (7-17). 
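
The connection between the test (7-22) and path independence can be illustrated with a small numerical sketch. The two differentials used here, 2xy dx + x^2 dy and y dx - x dy, are hypothetical examples, and the helper names are assumptions introduced only for this illustration. Note that `satisfies_7_22` samples the condition at a single point, so it is a spot check rather than a proof of exactness.

```python
def line_integral(X, Y, points, steps=1000):
    """Integrate X dx + Y dy along straight segments joining `points`,
    using the midpoint rule on each segment."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx = (x1 - x0) / steps
        dy = (y1 - y0) / steps
        for k in range(steps):
            t = (k + 0.5) / steps
            xm = x0 + t * (x1 - x0)
            ym = y0 + t * (y1 - y0)
            total += X(xm, ym) * dx + Y(xm, ym) * dy
    return total


def satisfies_7_22(X, Y, x, y, h=1e-5, tol=1e-6):
    """Spot check of (dX/dy)_x = (dY/dx)_y at one point by central differences."""
    dX_dy = (X(x, y + h) - X(x, y - h)) / (2.0 * h)
    dY_dx = (Y(x + h, y) - Y(x - h, y)) / (2.0 * h)
    return abs(dX_dy - dY_dx) < tol


# Two paths from (0, 0) to (1, 1).
PATH_A = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
PATH_B = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]


# Hypothetical exact differential: dz = 2xy dx + x^2 dy, i.e. z = x^2 y.
def exact_X(x, y):
    return 2.0 * x * y


def exact_Y(x, y):
    return x * x


# Hypothetical inexact differential: y dx - x dy.
def inexact_X(x, y):
    return y


def inexact_Y(x, y):
    return -x


if __name__ == "__main__":
    print(line_integral(exact_X, exact_Y, PATH_A),
          line_integral(exact_X, exact_Y, PATH_B))      # same value on both paths
    print(line_integral(inexact_X, inexact_Y, PATH_A),
          line_integral(inexact_X, inexact_Y, PATH_B))  # path-dependent values
```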

If the differential is not exact, it is called an inexact differential and is 
very often written in the notation đW. The most common example of an 
inexact differential is the increment of work since forces can be 
nonconservative, such as frictional forces or dissipative forces of any type. 


Exercises 


7-1. Verify that (7-7) and (7-10) are correct for the following examples: 
z = ax* + by, 27 = xy, z = cx®/y3, where a, b, c are constants. 

7-2. Which of the following are exact differentials: dz = x dx + y dy, 
dz = y dx + x dy, dz = (dx/y) + (x/y^2) dy? For those which are not exact, 
find the integral of dz counterclockwise around the closed path formed by the 
square whose corners (x, y) are at (0, 1), (1, 1), (1, 2), (0, 2). 

7-3. What relation does (7-22) have to the vector identity curl grad A = 0 
and to the condition that a mechanical force be derivable from a potential 
energy? 

7-4. Show that dz = y dx + (x + 2y) dy is exact. Find z(1, 2) − z(0, 0) by 
integrating dz along the two paths: (a) the line segment y = 0 to x = 1 and the 
line segment x = 1 to y = 2; (b) the line segment x = 0 to y = 2 and the line 
segment y = 2 to x = 1; and thus show directly that the same result is obtained 
in both cases. Do the same for any other two paths of your own choosing. 


8 Temperature, heat, and related concepts 


Thermodynamics is basically an empirical subject which deals with the 
macroscopic properties of matter. The principal results of thermo- 
dynamics are in the form of relations among experimentally determined 
quantities. The theoretical prediction of the absolute magnitudes of these 
quantities is regarded as being outside the scope of thermodynamics, 
being left to theories, such as kinetic theory and statistical mechanics, 
which deal with the atomic and molecular structure of matter. The 
development and application of thermodynamics depend very strongly on 
the ideas of equilibrium and the thermodynamic state of a system; we shall 
develop and clarify these in more detail as we go along. 


8-1 Temperature 


A completely new concept which is introduced by thermodynamics is 
that of temperature. Everyone has a qualitative feeling for hotness or 
coldness, and this can be used initially as a sort of crude thermometer. 
From long experience, one concludes that, when a system (pail of water, 
table, room, etc.) is observed to be in thermodynamic equilibrium, that is, 
when no gross variations are noted in its macroscopic properties over a 
period of time, the system has the same temperature everywhere. Similarly, 
when two systems have been in thermal contact for a long time, that is, 
they have been able to affect each other’s temperature, and it is finally 
decided that they are in mutual equilibrium, it is found that they have the 
same temperature. Hence it is concluded that equality of temperatures is a 
necessary condition for equilibrium between two systems. 

One can also conclude from these observations that temperature is a 
property of a thermodynamic state or is a state variable; that is, tempera- 
ture is one of the parameters whose numerical values are needed to specify 
the state of a system. These conclusions are often summarized in the 
following two-part statement, called a law, for historical reasons, although 
we shall state it in the form of a postulate: 


ZEROTH LAW OF THERMODYNAMICS. There exists a scalar state 
variable called temperature. Equality of the temperature is a necessary 
condition for thermodynamic equilibrium between two systems or 
between two parts of a single system. 

We shall generally represent the temperature by T although we will 
occasionally use t or θ; since it is characteristic only of the state, we know 
that its differential is exact and hence we can write dT as the temperature 
difference between two closely similar states. 

By using the zeroth law, we can devise a method of measuring the 
temperature, that is, of being able to assign a numerical value to it. In 
order to do this, we need only choose any property of a system which 
clearly changes with temperature—such as the length of a metal rod or the 
pressure of a gas in a container—and assign numbers to the temperature 
according to the magnitude of the chosen property. This has been 
generally done by first assigning arbitrary numerical values to two 
convenient calibration points and then subdividing the interval between 
them in a uniform way. Two common calibration points are the freezing 
and boiling points of water under atmospheric pressure; these are 
assigned the respective values of 0 degrees and 100 degrees to obtain the 
Celsius scale of temperature (previously called centigrade), or 32 degrees 
and 212 degrees to obtain the Fahrenheit scale. 
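
The calibration procedure just described amounts to a linear interpolation between two fixed points. The sketch below assumes a hypothetical thermometric property (the length of a metal rod, with made-up readings at the two calibration points); the function name `make_linear_scale` is likewise an assumption introduced for illustration.

```python
def make_linear_scale(prop_low, prop_high, t_low=0.0, t_high=100.0):
    """Return a function that converts a reading of the chosen property into a
    temperature by subdividing the calibration interval uniformly."""
    def temperature(prop):
        fraction = (prop - prop_low) / (prop_high - prop_low)
        return t_low + fraction * (t_high - t_low)
    return temperature


# Hypothetical rod lengths (meters) at the freezing and boiling points of water.
LENGTH_FREEZING = 1.0000
LENGTH_BOILING = 1.0020

celsius = make_linear_scale(LENGTH_FREEZING, LENGTH_BOILING, 0.0, 100.0)
fahrenheit = make_linear_scale(LENGTH_FREEZING, LENGTH_BOILING, 32.0, 212.0)

if __name__ == "__main__":
    reading = 1.0005   # a reading one quarter of the way along the interval
    print(celsius(reading), fahrenheit(reading))   # about 25 and about 77
```

A second thermometer built on a different property would use the same function with its own calibration readings, and nothing in the construction guarantees that the two would agree between the fixed points.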

If one carries out this procedure for two different materials, for example, 
and assigns the same numbers to the calibration points, one generally finds 
that at intermediate temperatures these two thermometers will give 
different numerical values for the temperature of the same body. In other 
words, temperature scales defined by the use of different materials or 
properties or both will agree only at the calibrating points. One could, of 
course, simply arbitrarily choose one material and one property to define 
a standard temperature scale, but, fortunately, there is a more satisfactory 
solution for this difficulty. We can get almost exact agreement over a 
considerable temperature range if the material used is a confined gas 
which is as far as possible from its boiling point. Helium or hydrogen is 
generally chosen for this purpose, and the pressure at constant volume is 
what is usually measured. We shall discuss this temperature scale in more 
detail later, and we shall find that this particular choice helps us to define a 
natural zero for our temperature scale. 


8-2 Thermodynamic systems 


A terminology which we shall constantly use is that of a thermodynamic 
system, which means that in order to specify completely the state of the 
system it is necessary to give its temperature as well as the other mechanical, 
electrical, magnetic, etc., variables which may be required. The simplest 
example of a thermodynamic system, and one which we shall frequently 
discuss, is a homogeneous fluid. The fluid has one mechanical degree of 
freedom, the volume V; it also has a thermal degree of freedom, the 
temperature T. In addition, one generally gives the pressure p exerted by 
the fluid, so that the three variables p, V, T are deemed sufficient to 
specify completely the state of the system. However, it is found by 
experiment that only two of these three variables are independent so that, 
for example, we can regard p as a function of T and V and write 

$$p = p(T, V) \tag{8-1}$$

Such an equation as this is usually called the equation of state, and it must be 
determined by experiment for each system of interest as it is, of course, 
not given by thermodynamics. 

For a given system, the number of variables needed to specify the 
thermodynamic state depends somewhat on the information desired and 
on the particular system. For example, if our system were a tank filled 
with oxygen gas, p, V, T would be sufficient to answer most questions 
about the state of the gas. However, if we wanted to know the total 
magnetic moment of this paramagnetic gas, these three would be insuffi- 
cient to specify the state and we would have to introduce the magnetiza- 
tion M and the induction B as well, leaving it to experiment to determine 
which of the variables p, V, T, B are significant in determining the value 
of M. Thus this example shows that the number of variables for a given 
system can change as requirements change, and, in a sense, one depends on 
experiment to determine how many are really needed to answer a given 
type of question. 

Many quantities of interest which can be determined experimentally 
with comparative ease are the various rates of change of one variable with 
respect to another. We can define three important ones as follows: 


$$\alpha = \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_p, \qquad \kappa_T = -\frac{1}{V}\left(\frac{\partial V}{\partial p}\right)_T, \qquad \beta = \frac{1}{p}\left(\frac{\partial p}{\partial T}\right)_V \tag{8-2}$$

α is called the coefficient of thermal expansion, and κ_T is the isothermal 
compressibility; β does not have a commonly used name. We see that 
these coefficients could be calculated if the exact form of (8-1) were known; 
conversely, measurements of these quantities would aid us in determining 
the equation of state. 

These three coefficients are not independent, for we see that we can 
write (7-10) as 

$$\left(\frac{\partial p}{\partial T}\right)_V \left(\frac{\partial T}{\partial V}\right)_p \left(\frac{\partial V}{\partial p}\right)_T = -1 \tag{8-3}$$

so that when the derivatives given by (8-2) are inserted into (8-3) we find 
that 

$$\beta = \frac{\alpha}{p \kappa_T} \tag{8-4}$$

Thus one of these coefficients can always be found from measurements of 
the other two. However, the principal importance of (8-4) is that it is an 
excellent example of the type of result which thermodynamic methods 
yield, that is, a relation among macroscopic, empirical quantities without 
any information about their absolute values. The fact that the value of a 
particular quantity can be obtained indirectly and exactly in this way is 
clearly of great help if it is difficult to measure directly. 
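
To see a relation of the type (8-4) at work on something other than an ideal gas, the following sketch evaluates the three coefficients numerically for an assumed van der Waals equation of state, p = RT/(v - b) - a/v^2 per kilomole. The equation of state and the constants `A` and `B` (roughly nitrogen-like values) are assumptions made only for this illustration; they are not taken from the text.

```python
R = 8.31e3   # gas constant, joules/(kilomole-degree)
A = 1.4e5    # assumed van der Waals constant a, J m^3/kilomole^2
B = 0.04     # assumed van der Waals constant b, m^3/kilomole


def pressure(T, v):
    """Assumed equation of state p = p(T, v) for one kilomole."""
    return R * T / (v - B) - A / v ** 2


def volume(T, p, lo=1.0, hi=1000.0):
    """Invert p(T, v) = p for v by bisection. This is valid on the gas branch,
    where the pressure decreases steadily as the volume increases."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if pressure(T, mid) > p:
            lo = mid   # pressure too high, so the volume must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


def coefficients(T, p, dT=1e-3, dp=1.0):
    """Central-difference estimates of alpha, kappa_T, and beta."""
    v = volume(T, p)
    alpha = (volume(T + dT, p) - volume(T - dT, p)) / (2.0 * dT) / v
    kappa_T = -(volume(T, p + dp) - volume(T, p - dp)) / (2.0 * dp) / v
    beta = (pressure(T + dT, v) - pressure(T - dT, v)) / (2.0 * dT) / p
    return alpha, kappa_T, beta


if __name__ == "__main__":
    p = 1.0e5
    alpha, kappa_T, beta = coefficients(300.0, p)
    print(beta, alpha / (p * kappa_T))   # the two values should agree closely
```

The coefficients are obtained independently of one another (two from V(T, p), one from p(T, V)), so the agreement is a genuine check of the relation and not a restatement of it.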


8-3 Work 


If we recall the definition of work used in mechanics, which is 


$$W = \int \mathbf{F} \cdot d\mathbf{s} \tag{8-5}$$


as given by (I: 4-11), it is clear that work cannot be a state variable of a 
system since it is a characteristic of the process as well as the states involved. 
In other words, we cannot say that a given system possesses a definite 
amount of work for a given condition; we can only tell how much work is 
done when the state of a system is changed. In order to see this in another 
way, let us consider the possibility that we can subject our system to a 
cyclic process; that is, we begin with the system in a certain state, change 
it to one or more other states, and finally return it to the initial state. In 
general, the net amount of work done in such a cycle will be different from 
zero if for no other reason than the existence of frictional or other 
nonconservative forces; however, if W were a state variable ΔW would be 
zero because the initial and final states are identical. Therefore the work W 
is not a state variable, and its differential đW is not exact. 


[Fig. 8-2. A cyclic process represented as a closed loop in the pV plane.] 


We shall adopt the sign convention that work done on the system is 
positive, and work done by it is negative. 

We can easily relate the work done đW to changes in the state variables 
for the important case in which our system is a fluid. From the definition 
of pressure as force per unit area and with the help of (8-5) and Fig. 8-1, we 
find that the work done in a change in volume dV is 

$$đW = -\int \mathbf{F} \cdot d\mathbf{s} = -\int p\, d\mathbf{a} \cdot d\mathbf{s} = -p \int d\mathbf{a} \cdot d\mathbf{s} = -p\, dV \tag{8-6}$$

In spite of its appearance, đW = −p dV is not an exact differential, as can 
be seen by considering a cyclic process such as that represented in the pV 
plane in Fig. 8-2. Since the area enclosed by the loop is different from 
zero, and since this represents the net work performed during the process, 
we have 

$$\oint đW = -\oint p\, dV \neq 0$$

so that from (7-19) we see again that đW is inexact. 
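
A rectangular cycle in the pV plane gives a simple numerical picture of this result. The sketch below assumes a hypothetical cycle made of two constant-pressure and two constant-volume steps and uses the sign convention adopted above (work done on the system is positive); the function name and the numerical values are illustrative assumptions.

```python
def work_on_system(cycle):
    """Net work done ON the fluid, W = -(closed integral of p dV), around a
    closed polygon whose vertices are given as (V, p) pairs.
    Along a straight segment p varies linearly with V, so the trapezoid
    rule used here is exact."""
    total = 0.0
    closed = list(cycle) + [cycle[0]]
    for (V0, p0), (V1, p1) in zip(closed, closed[1:]):
        total -= 0.5 * (p0 + p1) * (V1 - V0)
    return total


# Hypothetical cycle: expand at high pressure, compress at low pressure.
# Volumes in m^3, pressures in newtons/m^2.
CYCLE = [(1.0, 2.0e5), (3.0, 2.0e5), (3.0, 1.0e5), (1.0, 1.0e5)]

if __name__ == "__main__":
    # The enclosed area is 2.0 m^3 times 1.0e5 N/m^2, so the net work done
    # on the system is -2.0e5 joules; the reversed cycle gives +2.0e5 joules.
    print(work_on_system(CYCLE))
    print(work_on_system(CYCLE[::-1]))
```

The system returns to its initial state, yet the net work is not zero, which is the content of the statement that the differential of work is inexact.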


8-4 Heat 


Another important concept which we shall have frequent occasion to 
use is that of heat. The general idea of heat is familiar to us from such 
observations as are involved in putting a pan of water over a flame and 
finding that it warms up. A natural interpretation of the observed rise in 
temperature of the water is that a definite quantity of something, which we 
call heat, has been transferred from the hot stove to the cold water. If we 
then pour the hot water into cold water, we find that when the final 
equilibrium state is reached it has an intermediate temperature. This 
process can be similarly interpreted as a transfer of heat from the warmer 
system to the cooler system with the result that the warmer system cools 
off while the cooler one warms up. 

It is possible to reduce the measurement of heat to that of the measure- 
ment of temperature by an appropriate definition of the unit. The unit of 
heat is called a kilocalorie (or kilogram calorie) and is usually defined as 
the heat required at atmospheric pressure to raise the temperature of one 
kilogram of water the one degree from 14.5 to 15.5 degrees Celsius. 
Experimental results then show that the amount of heat required to 
change the temperature of a system through a given range depends on the 
system; for instance, on its mass or on the material of which it is com- 
posed. This leads to the idea of a heat capacity C of a system which, for 
the moment, we define somewhat crudely as the ratio of the heat ΔQ added 
to the system to the subsequent change in temperature ΔT; thus 

$$C = \frac{\Delta Q}{\Delta T} \tag{8-7}$$

One finds experimentally that the heat capacity is proportional to the 
amount of material involved; hence it is convenient to define: 

$$\text{Specific heat} = \text{heat capacity per unit mass} = c_m = \frac{C}{M} \tag{8-8}$$

where M is the total mass of the system, and 

$$\text{Molar heat capacity} = c = \frac{C}{\nu} \tag{8-9}$$

where ν is the number of kilomoles of the material. A kilomole (or 
kilogram mole) of material is defined as a quantity whose mass in kilograms 
equals its molecular weight w or, equivalently, as the mass of a 
number of molecules equal to Avogadro’s number L (6.025 × 10^26 
molecules/kilomole). For example, a kilomole of oxygen (O_2) contains 32 kilograms, 
while a kilomole of H_2O contains 18 kilograms; both samples contain the 
same number of molecules. Thus we have 

$$\nu = \frac{M}{w} = \frac{N}{L} \tag{8-10}$$
where N is the total number of molecules. As in (8-9), we shall let the 
value of a quantity which is evaluated for a kilomole be called the molar 
value. 

Since the quantity of heat ΔQ involved in changing the state depends on 
the process as well as on the initial and final states, it is clear from (8-7) 
that the heat capacity of a system is not a unique attribute of a system; 
one must distinguish among the heat capacities associated with the 
different processes which may be used to transfer heat to and from the 
system. The two most important examples are the heat capacity at 
constant volume, C_V, and that at constant pressure, C_p, which are defined 
by 

$$C_V = \left(\frac{đQ}{dT}\right)_V \tag{8-11}$$

$$C_p = \left(\frac{đQ}{dT}\right)_p \tag{8-12}$$

Whether these distinctions are quantitatively important depends on the 
system involved and must be determined from experiment. 

Heat can be added to a system by mechanical means as well, as is 
illustrated by the rise in temperature due to frictional forces such as in the 
proverbial rubbing together of two sticks to start a fire. In all these 
processes involving friction, it is found that the amount of work done ΔW 
has a definite relation to the quantity of heat produced ΔQ, regardless of 
the conditions of the experiment. The best early experiments on this 
effect were done by Joule, and his and later results can be summarized by 

$$\Delta W = J\, \Delta Q \tag{8-13}$$

where J = 4.186 × 10^3 joules/kilocalorie and is called the mechanical 
equivalent of heat or the heat equivalent of work. We need not carry the 
symbol J along in all our equations, for when work and heat appear in the 
same equation we need only give them both in the same unit, joules or 
kilocalories as may be convenient. 

We can conclude from the preceding discussion that the heat supplied 
to a system is not a unique function of the state since it depends on the 
process as well, as illustrated by the inequality C_p ≠ C_V and by the fact 
that the heat need not be added as heat but, for example, can be added by 
mechanical means involving friction or electrically by a current in a 
resistance. Thus a small quantity of heat added to a system is not an 
exact differential and must be written đQ. This, of course, means that a 
heat function or state variable Q, which is a definite function of the state 
of the system, does not exist. This result is analogous to the recognition 
that the question of how much rain there is in a lake is a meaningless one, 
although in principle one could always measure the amount of water 
which was added to the lake as rain during a given interval. 


8-5 Equation of state of an ideal gas 


As an example of a particular thermodynamic system, we shall consider 
the important system called an ideal gas. As the name implies, no real gas 
can be exactly treated as ideal, but, instead, an ideal gas represents a 
limiting behavior which is more closely approached by all real gases the 
less dense and the farther from their boiling points they are. For the 
usual conditions, helium is the nearest approximation to an ideal gas which 
we have. 

The form of the equation of state can be deduced from a series of 
famous experimental results. 


Boyle’s law 
For a given mass of gas and constant temperature, 
pV =const. (8-14) 


We can conclude from this result that the constant in (8-14) may be a 
function of the mass and the temperature. 


Charles’ law 


For a given mass of gas, and a suitable choice of temperature scale, the 
constant in (8-14) is proportional to the temperature; hence we can write 


pV = KT (8-15) 


where K is another constant which is the same for all gases. In order to see 
how this result can be related to experiment, we use (8-15) in (8-2) and 
find that 

$$\alpha = \beta = \frac{1}{T} \tag{8-16}$$

which shows that (8-15) implies that these coefficients are independent of 
the gas and depend only on the temperature. These conclusions are 
verified by experiment. We can use the results (8-16) to define the temperature 
scale T, known as the gas thermometer scale, by the equation T = 1/α. 
If we choose the unit of this scale to be the same as the Celsius degree, we 
shall have the relation 

$$T = \frac{1}{\alpha} = t + T_0 \tag{8-17}$$

where t is the Celsius temperature. We see from (8-17) that the additive 
constant T_0 relating the two scales can be found from experiment as 

$$T_0 = \left(\frac{1}{\alpha}\right)_{t=0} = 273.15 \text{ degrees} \tag{8-18}$$


In order to evaluate K, we turn to a third result. 


Avogadro’s law 


Although this law can be stated in various ways, it will be sufficient for 
our purposes to say that, under identical conditions, all gases have equal 
molar volumes; that is, the volume of a kilomole, v, is independent of the 
gas. Referring (8-15) to one kilomole, we have 


$$pv = K_{\text{mole}} T = RT \tag{8-19}$$


where R = K_mole is called the universal gas constant. We can evaluate R 
by measuring the rest of the quantities in (8-19) for any arbitrary state; 
the results are generally quoted for standard conditions: 


p_0 = 1 atmosphere = pressure of a column of mercury 
0.76 meter high 
= 1.013 × 10^5 newtons/(meter)^2 
T_0 = 273.15 degrees (i.e., 0°C) 
v_0 = 22.4 × 10^3 liters = 22.4 (meter)^3 (for one kilomole) 


so we can find from (8-19) that 


R = 8.31 × 10^3 joules/kilomole-degree = 1.99 
kilocalories/kilomole-degree (8-20) 
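
The arithmetic behind (8-20) is short enough to check directly. The sketch below uses the standard-condition values quoted above, taking the volume of one kilomole as 22.4 cubic meters, together with the mechanical equivalent of heat from (8-13); the constant and function names are illustrative assumptions.

```python
P0 = 1.013e5          # standard pressure, newtons/meter^2
T0 = 273.15           # standard temperature, degrees on the gas scale
V0_KILOMOLE = 22.4    # volume of one kilomole at standard conditions, meter^3
J_PER_KCAL = 4.186e3  # mechanical equivalent of heat, joules/kilocalorie


def gas_constant(p, v, T):
    """Solve pv = RT for R, with v the volume of one kilomole."""
    return p * v / T


if __name__ == "__main__":
    R = gas_constant(P0, V0_KILOMOLE, T0)
    print(R)                # about 8.31e3 joules/kilomole-degree
    print(R / J_PER_KCAL)   # about 1.98 kilocalories/kilomole-degree
```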


Since v = V/ν we can write (8-19) as p(V/ν) = RT or 

$$pV = \nu RT \tag{8-21}$$

which is the ideal gas equation of state and, when written as p = νRT/V, is 
in the general form originally given in (8-1). 

As a simple application of this equation of state, we can use (8-21) to 
verify the empirical result known as Dalton’s law of partial pressures: 
the pressure exerted by a mixture of ideal gases is the sum of the pressures 
which each gas would exert if it occupied the total volume. If ν_i is the mole 
number of the ith gas, ν = ν_1 + ν_2 + ··· and (8-21) becomes 

$$pV = (\nu_1 + \nu_2 + \cdots) RT = \nu_1 RT + \nu_2 RT + \cdots = p_1 V + p_2 V + \cdots$$

where p_i is the pressure which would be exerted by the ith gas if it alone 
were present; therefore 

$$p = p_1 + p_2 + \cdots \tag{8-22}$$

which agrees with experiment. We also see that we can write 

$$p_i = \left(\frac{\nu_i}{\nu}\right) p = \frac{\nu_i}{\nu_1 + \nu_2 + \cdots}\, p = x_i p \tag{8-23}$$

where x_i is called the mole fraction of the ith gas. 
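
As a small worked illustration of (8-22) and (8-23), the sketch below computes the partial pressures of a mixture from the mole numbers. The composition, temperature, and volume are hypothetical values chosen for the example, and the function name is an assumption.

```python
R = 8.31e3   # gas constant, joules/(kilomole-degree)


def partial_pressures(mole_numbers, T, V):
    """Partial pressures p_i = x_i p of an ideal gas mixture, where
    x_i = nu_i / nu is the mole fraction and p = nu R T / V."""
    nu = sum(mole_numbers)
    p_total = nu * R * T / V
    return [(nu_i / nu) * p_total for nu_i in mole_numbers]


if __name__ == "__main__":
    # Hypothetical mixture: mole numbers in kilomoles, T in degrees, V in m^3.
    moles = [0.78, 0.21, 0.01]
    T, V = 300.0, 24.0
    pressures = partial_pressures(moles, T, V)
    print(pressures)
    print(sum(pressures), sum(moles) * R * T / V)   # the two totals agree
```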


Exercises 


8-1. Verify that (8-4) is correct for the example of an ideal gas. 
8-2. Show that (∂α/∂p)_T + (∂κ_T/∂T)_p = 0. 
8-3. It is found for a certain system that α and κ_T can be written in the forms 


α = νR/pV + νa/VT^2, κ_T = (νT/V) f(p) 


where a and R are constants and f(p) is some function of the pressure. What 
sort of experimental conditions would result in an expression like that for κ_T 
which involves an undetermined function? Find f(p) and V = V(p, T). [Hint: 
Use (7-22) and (8-2).] 


9 The first law of thermodynamics 


The first law represents a generalization of the results of wide and varied 
experience. We shall state it in the form of a postulate consisting of 
essentially two parts, a definition and a characteristic property. 


FIRST LAW OF THERMODYNAMICS. Every thermodynamic system 
possesses a state variable called the energy U whose increase dU equals 
the heat đQ added to the system plus the external work đW performed 
on it. The energy of an isolated system is constant. 


Since U is a state variable, its differential is exact, as indicated above. 
The first law when stated in quantitative differential form is 

$$dU = đQ + đW \tag{9-1}$$

and, therefore, in a cyclic process, 

$$\oint dU = 0 \tag{9-2}$$


The first law combines the recognition of the fact that heat is a form of 
energy with the principle of the conservation of energy. This principle is 
thereby generalized from its previous form which was limited to mechani- 
cal and electromagnetic energy. 

Historically, the first law usually was stated in a negative way: It is 
impossible to construct a machine which operates in a cycle and does a net 
amount of work on the surroundings without obtaining energy from an 
external source. In this form the first law assures us that it is impossible to 
build a perpetual motion machine of the first kind, that is, one which 
violates the energy conservation principle. However, it is important to 
realize that the first law is more than simply a restatement of the conservation 
of energy because of its assertion that the energy U is a well-defined 
function of the variables used to describe the state of the system. In fact, U 
must be a single-valued function, for, if it were not, we could use the 
multiple-valued property to construct a machine to create energy by 
operating it in an appropriate cycle. 

If we consider a fluid system of fixed mass whose variables are p, V, T, 
and of which only two are independent because they are related by the 
equation of state (8-1), we see that U is therefore a function of only two 
variables and we could choose any pair; that is, we could write U(T, V) or 
U(T, p) or U(p, V). The first law then tells us that, for example, 

$$dU = \left(\frac{\partial U}{\partial T}\right)_V dT + \left(\frac{\partial U}{\partial V}\right)_T dV \tag{9-3}$$

is exact. For a fluid system we can substitute (8-6) into (9-1); then we have 

$$đQ = dU + p\, dV \tag{9-4}$$


9-1 Heat capacities 


As an example of the application of the first law, we shall see what it can 
tell us about the heat capacities we previously introduced as essentially the 
amount of heat required to change the temperature of the system of 
interest by one degree. The two most important cases correspond to 
processes at constant volume or at constant pressure; in the first, no 
external work is done, whereas work is involved in the second. 
From (8-7) and (9-4), we see that the general heat capacity can be 
written 

$$C = \frac{dU + p\, dV}{dT} \tag{9-5}$$

It is convenient to regard U = U(T, V) so that, for a constant volume 
process for which dV = 0, we find from (9-3) that dU = (∂U/∂T)_V dT and 
then (9-5) yields 

$$C_V = \left(\frac{đQ}{dT}\right)_V = \left(\frac{\partial U}{\partial T}\right)_V \tag{9-6}$$

which we can use to calculate C_V if we know the dependence of U on T and 
V; conversely, if we measure C_V we learn about the temperature 
dependence of the energy. 

The comparable calculation for the constant pressure process (dp = 0) 
is more involved because we need to use both terms of (9-3). Thus we find 
from (9-4) that we can write, in general, 

$$đQ = \left(\frac{\partial U}{\partial T}\right)_V dT + \left[\left(\frac{\partial U}{\partial V}\right)_T + p\right] dV = C_V\, dT + \left[\left(\frac{\partial U}{\partial V}\right)_T + p\right] dV \tag{9-7}$$


Since this is not directly applicable to our constant pressure case in this 
form, it is convenient to write the equation of state in the form V = V(T, p) 
so that 

$$dV = \left(\frac{\partial V}{\partial T}\right)_p dT + \left(\frac{\partial V}{\partial p}\right)_T dp \tag{9-8}$$

If we now set dp = 0 in (9-8) and substitute the result into (9-7), we can 
identify the coefficient of dT as C_p; that is, 

$$C_p = \left(\frac{đQ}{dT}\right)_p = C_V + \left[\left(\frac{\partial U}{\partial V}\right)_T + p\right] \left(\frac{\partial V}{\partial T}\right)_p \tag{9-9}$$

Since it is generally more convenient to keep the system at constant 
pressure in the laboratory than to keep it at constant volume, C_p is easier 
to measure than is C_V. As we shall find later, however, C_V is easier to 
calculate theoretically; thus the important result (9-9) is very useful in 
the comparison of theory and experiment. 


Example. Ideal Gas. From the equation of state (8-21), we find that 


$$\left(\frac{\partial V}{\partial T}\right)_p = \frac{\nu R}{p} \tag{9-10}$$

In order to evaluate the term in (9-9) which comes from the volume 
dependence of the energy, we must turn to direct experiment at this 
stage, although we shall see later that there is another way of evaluating 
$(\partial U/\partial V)_T$. For now, we shall simply state the experimental fact: The 
energy of an ideal gas is a function of the temperature only, that is, 
U = U(T) (we shall justify this result in several ways, both in this 
chapter and later). If such is the case, 

$$\left(\frac{\partial U}{\partial V}\right)_T = 0 \quad \text{(ideal gas)} \tag{9-11}$$


and, if we substitute (9-10) and (9-11) into (9-9), we can obtain the 
important result 


$$C_p - C_v = p\,\frac{\nu R}{p} = \nu R \tag{9-12}$$

or, for one kilomole, 

$$c_p - c_v = R \tag{9-13}$$


(Unless we state otherwise we shall always use lower-case letters to 
represent the kilomolar values of quantities which are proportional to 
the amount of material; such general quantities are called extensive.) 
The relation (9-12) is quite well confirmed by experiment. 

Since U = U(T) for the ideal gas, 


$$C_v = \left(\frac{\partial U}{\partial T}\right)_V = \frac{dU}{dT} = C_v(T) \tag{9-14}$$

so that the heat capacity is also a function only of the temperature. If we 
integrate (9-14), we can therefore say that 

$$U(T) = \int_0^T C_v(T')\,dT' + U_0 \tag{9-15}$$

which generally cannot be evaluated until the dependence of $C_v$ on T has 
been determined. The constant of integration $U_0$ is called the zero 
point energy; from the point of view of our thermodynamic theory we 
cannot evaluate it further and, since only energy changes are of real 
importance, it is often customary to ignore $U_0$ by setting it equal to zero. 
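
The result (9-12) can be checked numerically by evaluating the right side of (9-9) with finite differences. The following is only a minimal sketch: the numerical value of R, the constant kilomolar heat capacity of 3R/2, the choice $U_0 = 0$, and all function names are assumptions made for illustration and do not come from the text.

```python
# Finite-difference check of (9-9) for an ideal gas.
# Assumptions (not from the text): R in SI units per kilomole, a constant
# kilomolar heat capacity c_v = 3R/2, and U_0 = 0.

R = 8314.0  # J/(kilomole K), approximate gas constant


def volume(T, p, nu=1.0):
    """Equation of state (8-21) solved for V."""
    return nu * R * T / p


def energy(T, V, nu=1.0, c_v=1.5 * R):
    """Ideal-gas energy U = nu*c_v*T; V is accepted but deliberately unused."""
    return nu * c_v * T


def heat_capacity_difference(T, p, nu=1.0, dT=1e-3, dV=1e-6):
    """C_p - C_v = [(dU/dV)_T + p] * (dV/dT)_p, taken from (9-9)."""
    V = volume(T, p, nu)
    dU_dV = (energy(T, V + dV, nu) - energy(T, V - dV, nu)) / (2.0 * dV)
    dV_dT = (volume(T + dT, p, nu) - volume(T - dT, p, nu)) / (2.0 * dT)
    return (dU_dV + p) * dV_dT


nu = 2.0
difference = heat_capacity_difference(T=300.0, p=1.0e5, nu=nu)
print(difference, nu * R)  # the two numbers should agree closely, as in (9-12)
```

Because the assumed energy function does not depend on V, the first term in the bracket of (9-9) vanishes and only the work term p(∂V/∂T)_p contributes.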


9-2 Enthalpy 


We define a new state variable H called enthalpy by 

$$H = U + pV \tag{9-16}$$


We shall see later why it is defined in this particular way. Since H is a 
state variable, its exact differential is given by 

$$dH = dU + p\,dV + V\,dp = dQ + V\,dp \tag{9-17}$$

with the use of (9-4). 
In a constant pressure process, dp = 0, and we see from (9-17) that 

$$dH = (dQ)_p \tag{9-18}$$


Thus dH is the quantity of heat transferred to the system of interest from 
an external source during a process at constant pressure; because constant 
pressure changes are of such practical importance, enthalpy is often called 
the heat function. If we write H = H(T, p) so that 


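$$dH = \left(\frac{\partial H}{\partial T}\right)_p dT + \left(\frac{\partial H}{\partial p}\right)_T dp \tag{9-19}$$

then, when dp = 0, 

$$dH = \left(\frac{\partial H}{\partial T}\right)_p dT = (dQ)_p \tag{9-20}$$

and we find from (8-12) that 

$$C_p = \left(\frac{\partial H}{\partial T}\right)_p \tag{9-21}$$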


which is a result similar to (9-6) and in principle provides us with a method 
of calculating $C_p$ by differentiation of a single function. 
For an ideal gas, $pV = \nu RT$ and therefore 

$$H = U + \nu RT = H(T) \quad \text{(ideal gas)} \tag{9-22}$$

because of (9-11). 

We shall have much use for enthalpy in other connections later, but for 
the present we shall simply indicate very briefly its importance in engineer- 
ing. It is directly related to the energy flow involved in steady state 
processes where work is performed, as, for example, in turbines. We shall 
refer our considerations to one kilomole of the material which is passing 
through such a mechanism, as illustrated schematically in Fig. 9-1. The 
inlet pipe has a cross section $A_1$; when a kilomole of volume $v_1$ enters, the 
fluid will have been displaced a distance $v_1/A_1$. Since the pressure is $p_1$, 
the force on this fluid is $p_1 A_1$ and the total work done on the fluid was 
$(v_1/A_1) \times (p_1 A_1) = p_1 v_1$. In addition, the entering fluid had an energy $u_1$ 
so that the total energy flux into the machine per kilomole is 
$u_1 + p_1 v_1 = h_1$. Similarly, the molar energy flux out is $u_2 + p_2 v_2 = h_2$. 
If the work done per kilomole by the machine on its surroundings is $-w$ while the 
heat q is consumed, then because the total energy in must equal the total 
energy out, since the machine is operating under steady state conditions, 
we must have $h_1 + q = h_2 - w$, or 

$$-w = q + h_1 - h_2 \tag{9-23}$$

Fig. 9-1 


Thus the relation between the input and the output of the machine has a 
very simple connection with the initial and final states of the fluid and 
does not refer to whatever specific processes may be occurring inside the 
apparatus. 
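
The bookkeeping behind (9-23) can be illustrated with a few lines of code. This is a sketch only: the function names and the inlet and outlet values are hypothetical numbers chosen for easy arithmetic, not data for any real machine.

```python
# Steady-flow energy balance per kilomole, following (9-23).
# All numerical values below are illustrative assumptions in SI units.

def molar_enthalpy(u, p, v):
    """h = u + p*v for one kilomole of fluid."""
    return u + p * v


def work_output_per_kilomole(h_in, h_out, q=0.0):
    """Work done by the machine on its surroundings, -w = q + h1 - h2."""
    return q + h_in - h_out


h1 = molar_enthalpy(u=6.0e6, p=5.0e5, v=4.0)    # inlet state
h2 = molar_enthalpy(u=4.5e6, p=1.0e5, v=15.0)   # outlet state
work_out = work_output_per_kilomole(h1, h2, q=0.0)  # insulated machine, q = 0
print(h1, h2, work_out)
```

Only the inlet and outlet enthalpies and the heat enter the calculation; nothing about the interior of the machine is needed.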


9-3 Reversible and irreversible processes 


Before we discuss particular processes in more detail, it is necessary to 
emphasize the important general distinction between reversible and 
irreversible processes. By a reversible process we mean one which, at any 
stage, can be reversed and the system of interest brought back to its 
initial state without producing any lasting changes of any sort in the 
system itself, or in its surroundings. In practice, this idealization is 
generally taken to be a quasi-static process, which is defined as a sequence 
of equilibrium states or, more accurately, as a sequence of states in which 
at any time the system is only infinitesimally away from a true equilibrium 
state. 

The reason for the term “‘quasi-static’’ is that most imaginable processes 
of this type require that all velocities involved approach zero; hence all 
changes are produced infinitely slowly. For example, if our system is a gas 
which is allowed to expand by our decreasing the external pressure, this 
can be done quasi-statically and reversibly by our keeping the external 
pressure only infinitesimally lower than the gas pressure. The movable 
piston which confines the gas in the container will then move with only 
an infinitesimal speed, so that the friction between it and the walls will be 
negligible; also, no turbulence will be produced in the gas as a result of 
portions of the gas receiving finite velocities and thus there will be no 
dissipative viscous effects. As an over-all result, the gas will always be 
arbitrarily close to equilibrium during the whole expansion. We see that 
we can also describe this particular reversible process as one which may be 
made to reverse and proceed in the opposite direction by making an in- 
finitesimal change in the parameters (in this case, by altering the pressure) ; 
if this is done, it will go through the same intermediate equilibrium states, 
but in the reverse order. Although a reversible process generally requires 
that everything be done infinitely slowly, this is not a sufficient condition; 
for example, the discharge of a condenser through a very large resistance 
will occur very slowly, yet the energy is dissipated as heat in the resistance, 
thus making the process irreversible, as we shall see in detail in the next 
chapter. 

The principal importance of reversible processes in thermodynamics is 
that we can write definite equations to describe every part of the reversible 
process; the reason, of course, is that our thermodynamic equations 
apply only to equilibrium states and the system is always arbitrarily close 
to equilibrium in the sequence of intermediate stages between the initial 
and final states of the system. 

On the other hand, natural processes encountered in practice are 
always irreversible and represent processes in which disturbed equilibria 
are being equalized by the flow of heat across finite temperature differences 
or by work being done across finite pressure differences; these dissipative 
effects are generally described macroscopically as friction, viscosity, 
hysteresis, resistance, etc. Since an irreversible process does not include 
equilibrium states during its intermediate stages we cannot describe such 
processes by equations, although we shall see that some aspects of irrever- 
sible processes can be described by inequalities. Of course, the initial and 
final states of an irreversible process are equilibrium states and we can 
write appropriate equations for each of them. 


9-4 Reversible adiabatic and isothermal processes 


The only specific processes we have considered up to this point are 
those for which p or V is held constant, although there are other possible 
processes for which other quantities are kept fixed. An important example 
is an adiabatic process which is defined as one in which no heat is added to 
or taken from the system, so that dQ = 0; we shall see in the next chapter 
precisely what variable remains constant in this process. An adiabatic 
process can be fairly well approximated in practice by enclosing the 
system in a good heat insulator. 
Setting dQ = 0 in (9-4), we find that, for a fluid, 

$$dU + p\,dV = 0 \tag{9-24}$$


and, if we were to write U = U(p, V) and substitute the resulting dU into 
(9-24), this equation would describe a family of curves, called adiabatics, 
in the pV plane. 


Example. Ideal Gas. From (9-14), we find that $dU = C_v\,dT$; hence 
(9-24) can be written 

$$C_v\,dT + p\,dV = 0 \tag{9-25}$$

In order to put (9-25) into the form in which p and V are the only 
variables, we use the equation of state (8-21); from it we find that 

$$dT = \frac{p\,dV + V\,dp}{\nu R} \tag{9-26}$$

If we substitute (9-26) into (9-25) and use (9-12), we obtain 

$$C_p\,p\,dV + C_v\,V\,dp = 0 \tag{9-27}$$

If we define 

$$\gamma = \frac{C_p}{C_v} = \frac{c_p}{c_v} \tag{9-28}$$

and divide (9-27) through by $pVC_v$, we find that the adiabatic condition 
dQ = 0 can also be written for an ideal gas as 

$$\frac{dp}{p} + \gamma\,\frac{dV}{V} = 0 \tag{9-29}$$

If we assume for simplicity that $\gamma$ = const., we can integrate (9-29) to 
find that $\ln p + \gamma \ln V = \text{const.} = \ln(pV^{\gamma})$, or 

$$pV^{\gamma} = \text{const.} \tag{9-30}$$

which is the equation describing an adiabatic process for an ideal gas. 
By using (8-21), we can also write (9-30) in terms of the other two pairs 
of variables; the results are that 

$$TV^{\gamma-1} = \text{const.} \quad \text{and} \quad Tp^{(1-\gamma)/\gamma} = \text{const.} \tag{9-30'}$$
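
The substitutions that lead from (9-30) to the two forms in (9-30') are not written out in the text; a short worked version, using only the equation of state (8-21) and absorbing the fixed factor νR into the constants, might read as follows.

```latex
% Eliminate p with (8-21), p = \nu R T / V:
\frac{\nu R T}{V}\,V^{\gamma} = \text{const.}
\quad\Longrightarrow\quad
T V^{\gamma-1} = \text{const.}

% Eliminate V with (8-21), V = \nu R T / p:
p\left(\frac{\nu R T}{p}\right)^{\gamma} = \text{const.}
\quad\Longrightarrow\quad
T^{\gamma} p^{\,1-\gamma} = \text{const.}
\quad\Longrightarrow\quad
T p^{(1-\gamma)/\gamma} = \text{const.}
```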


Another important example is an isothermal process for which the 
temperature is kept constant and thus dT = 0. For an ideal gas the 
equation of the curve is simply obtained by setting T equal to a constant 
in the equation of state (8-21); thus 

$$pV = \text{const.} \tag{9-31}$$

Fig. 9-2 (curves begin at the point labeled $(p_i, V_i)$) 

The slope of an adiabatic can be obtained from (9-29) as 

$$\frac{dp}{dV} = -\gamma\,\frac{p}{V} \tag{9-32}$$

while from (9-31) we find that p dV + V dp = 0 and thus the slope of an 
isothermal is found to be 

$$\frac{dp}{dV} = -\frac{p}{V} \tag{9-33}$$

Since $\gamma > 1$ because of (9-12), the magnitude of the slope of an adiabatic 
must be greater than the magnitude of the slope of an isothermal, so that 
the adiabatics are steeper curves than are the isothermals. This is 
illustrated by Fig. 9-2 which shows the course of two expansions beginning at 
the same initial state $(p_i, V_i)$; we are able to illustrate these processes 
as specific curves because the processes are reversible. We cannot, 
of course, plot intermediate stages of an irreversible process. 
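
The comparison of slopes in (9-32) and (9-33) can be seen numerically by following both curves from a common initial state. The sketch below is illustrative only: the function names, the initial state, and the value γ = 5/3 are assumptions and are not taken from the text.

```python
# Compare the isothermal (9-31) and adiabatic (9-30) curves through one state.
# The initial state and gamma are illustrative assumptions in arbitrary units.

def isothermal_pressure(p_i, V_i, V):
    """Pressure on the isothermal through (p_i, V_i): pV = const."""
    return p_i * V_i / V


def adiabatic_pressure(p_i, V_i, V, gamma):
    """Pressure on the adiabatic through (p_i, V_i): p V**gamma = const."""
    return p_i * (V_i / V) ** gamma


def slope(curve, V, h=1e-6):
    """Central-difference estimate of dp/dV on a curve p(V)."""
    return (curve(V + h) - curve(V - h)) / (2.0 * h)


p_i, V_i, gamma = 4.0, 1.0, 5.0 / 3.0


def iso(V):
    return isothermal_pressure(p_i, V_i, V)


def adi(V):
    return adiabatic_pressure(p_i, V_i, V, gamma)


slope_iso = slope(iso, V_i)  # should be close to -p_i/V_i, as in (9-33)
slope_adi = slope(adi, V_i)  # should be close to -gamma*p_i/V_i, as in (9-32)
print(slope_iso, slope_adi, iso(2.0), adi(2.0))
```

For any expansion to a larger volume the adiabatic pressure falls below the isothermal one, which is the statement that the adiabatics are the steeper curves.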


9-5 Free expansion 


A free expansion is an example of an irreversible adiabatic process. 
Let us consider the experiment illustrated in Fig. 9-3 for which the appara- 
tus consists of a well-insulated container which is divided into two parts. 
One portion is filled with a compressed gas, and the other portion is a 
vacuum. If we now quickly remove the partition, the gas will rush into the 
evacuated part, and eventually the gas will settle down to a new equilib- 
rium state in which it fills the whole container. The temperature of the gas 
is measured both before and after the expansion. 
Fig. 9-3 


The initial and final states are equilibrium states (although the 
intermediate ones obviously are not); hence we can write (9-1) as 

$$\Delta U = \Delta Q + \Delta W \tag{9-34}$$

Since the box is well insulated, the process is adiabatic and $\Delta Q = 0$; 
the gas expands against no external pressure, hence it does no work and 
$-\Delta W = 0$. Therefore 

$$\Delta U = 0 \tag{9-35}$$


showing that the internal energy of a gas remains constant in a free 
expansion. This experiment enables us to learn something about the 
dependence of the energy on the volume, for, if U = U(T, V), then 


$$\Delta U = \left(\frac{\partial U}{\partial T}\right)_V \Delta T + \left(\frac{\partial U}{\partial V}\right)_T \Delta V = 0 \tag{9-36}$$

because of (9-35). Because of the conditions of the experiment, $\Delta V \neq 0$; 
therefore, if it is found that $\Delta T \neq 0$, we see from (9-36) that both the 
partial derivatives must be different from zero and of opposite sign in 
order to satisfy $\Delta U = 0$. Thus, if a difference of temperature is found for 
the gas, we must conclude that the energy depends on both the temperature 
and the volume. 

It was found, however, that $\Delta T = 0$ for an ideal gas; therefore (9-36) 
becomes $(\partial U/\partial V)_T\,\Delta V = 0$ so that $(\partial U/\partial V)_T = 0$, which is exactly the 
condition (9-11). Thus this free expansion experiment shows that the 
energy of an ideal gas is independent of the volume; one can similarly 
conclude that the energy is independent of the pressure as well, so that 
U = U(T) as was previously stated in (9-15). 


9-6 Porous plug experiment 


In the free expansion the gas was allowed to gain kinetic energy of mass 
motion. In the porous plug experiment, illustrated schematically in Fig. 
9-4, the gas is made to pass slowly through the plug from compartment 1 
to compartment 2 by simultaneous motion of the two pistons. The 
primary purpose of the porous plug is to allow the gas to be slowed down 
without having it do any work on the surroundings once the steady state 
has been attained. Then one can measure the temperature change resulting 
from passage from one state to the other in order to get an idea of the 
energy change without the necessity of considering the effect of the gas 
having temporarily obtained kinetic energy of mass motion. Because the 
whole apparatus is kept well insulated, this is also an adiabatic process. 
Such an arrangement is of exactly the type of steady state situation 
illustrated in Fig. 9-1 and described by (9-23). In this case, q = 0 since the 
process is adiabatic, and $-w = 0$ since the gas does no external work; 
thus (9-23) becomes 

$$h_1 = h_2 \tag{9-37}$$

Fig. 9-4 (porous plug between compartments 1 and 2, with $p_1 > p_2$) 


Therefore the porous plug experiment is a process in which the enthalpy 
of the gas remains constant. 

For the time being, we consider only the application of this result to an 
ideal gas for which the molar enthalpy is obtained from (9-22) as 
$h = u + RT$. Inserting this expression into (9-37), we quickly find that 

$$u_2 - u_1 = -R(T_2 - T_1) = -R\,\Delta T \tag{9-38}$$

It is found experimentally that $\Delta T$ is very small and is more nearly equal 
to zero the more nearly the gas is ideal; hence we conclude that, for the 
limiting case of an ideal gas, $\Delta T = 0$ and $u_2 = u_1$ by (9-38). Therefore u is 
independent of the volume, and we have again found that the energy of an 
ideal gas depends only on the temperature. 


9-7 Another state variable for the ideal gas 


We can use all this information about an ideal gas to obtain an important 
result which suggests the content of the next chapter. Using (8-21) and 
(9-14), we can write (9-4) as 

$$dQ = C_v(T)\,dT + \frac{\nu RT}{V}\,dV \tag{9-39}$$

Now dQ is not an exact differential, but if we divide both sides of (9-39) by 
T we obtain 

$$\frac{dQ}{T} = \frac{C_v(T)\,dT}{T} + \frac{\nu R\,dV}{V} \tag{9-40}$$

which is an exact differential because the right side is integrable. Therefore, 
if we set 

$$dS = \frac{dQ}{T} \tag{9-41}$$

we can integrate (9-40) between the states $(T_0, V_0)$ and (T, V) and we find 
that 

$$S - S_0 = \int_{T_0}^{T} \frac{C_v(T')\,dT'}{T'} + \nu R \ln\left(\frac{V}{V_0}\right) \tag{9-42}$$

and, if we assume for simplicity that $C_v$ = const., (9-42) becomes 

$$S - S_0 = C_v \ln\left(\frac{T}{T_0}\right) + \nu R \ln\left(\frac{V}{V_0}\right) \tag{9-43}$$

Thus we have found another state variable for an ideal gas since (9-43) 
depends only on the initial and final states and is independent of the path 
of integration. The function S is called the entropy. We can now rewrite 
the first law for an ideal gas in terms of entropy, for, from (9-41), dQ = 
T dS, and if we substitute this into (9-4) we have 

$$dU = T\,dS - p\,dV \tag{9-44}$$

which shows that T is conjugate to S in the same sense that $-p$ is to V; 
that is, if we write U = U(S, V), then 

$$T = \left(\frac{\partial U}{\partial S}\right)_V \quad \text{and} \quad -p = \left(\frac{\partial U}{\partial V}\right)_S \tag{9-45}$$


The variables V and S are both extensive variables; that is, they are 
proportional to the quantity of matter in the system. The variables p and 
T, however, are intensive variables; that is, their values are independent of 
the quantity of material. Reversible adiabatic processes are also called 
isentropic since, if dQ = 0, then dS = 0. 

In the next chapter, we shall proceed with a generalization of this result 
and show that every system possesses the entropy as a state variable, so 
that this property is not restricted to an ideal gas. 
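
As a small numerical illustration of (9-43) and of the remark that reversible adiabatic processes are isentropic, the sketch below evaluates the entropy change along an adiabatic, where it should vanish, and along an isothermal. The value of R, the constant kilomolar heat capacity 3R/2, and the function name are assumptions made only for this example.

```python
import math

# Entropy change of an ideal gas from (9-43), assuming constant C_v.
# Assumptions (not from the text): R per kilomole in SI units, c_v = 3R/2.

R = 8314.0          # J/(kilomole K), approximate gas constant
C_V_MOLAR = 1.5 * R
GAMMA = (C_V_MOLAR + R) / C_V_MOLAR  # uses c_p - c_v = R from (9-13)


def entropy_change(T, V, T0, V0, nu=1.0):
    """S - S0 = C_v ln(T/T0) + nu R ln(V/V0), with C_v = nu * c_v."""
    return nu * C_V_MOLAR * math.log(T / T0) + nu * R * math.log(V / V0)


T0, V0, V = 300.0, 1.0, 2.0

# Final temperature on the reversible adiabatic, from T V**(gamma-1) = const.
T_adiabatic = T0 * (V0 / V) ** (GAMMA - 1.0)

dS_adiabatic = entropy_change(T_adiabatic, V, T0, V0)   # expected to vanish
dS_isothermal = entropy_change(T0, V, T0, V0)           # equals R ln 2 here
print(dS_adiabatic, dS_isothermal)
```

The adiabatic result is zero only to within rounding error, since the temperature drop exactly offsets the volume increase in (9-43).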
Exercises 


9-1. Eight liters of an ideal gas are initially at a pressure of 4 atmospheres 
and 200°C. The gas is allowed to expand until the pressure is reduced to 
1 atmosphere. Find the final volume and temperature, the work done, and the 
heat absorbed under each of the following conditions: (a) isothermal expansion; 
(b) adiabatic expansion ($\gamma$ = §); (c) the expansion is into a vacuum. 

9-2. A certain system has the equation of state V = AT + (p/B) and its 
energy is given by U = CT — 3BV* where A, B, and C are constants. Find the 
enthalpy, C,, and C;. 

9-3. Discuss in detail how you could perform a reversible expansion of a gas 
in a cylinder closed off by a movable piston. How would you store the work 
done by the expanding gas as potential energy of the surroundings so that if the 
process were reversed the work could be retrieved without permanently altering 
the surroundings? 


10 The second law of thermodynamics 


We shall proceed as before by stating the second law as a postulate which 
has two parts: a definition and a characteristic property. 


SECOND LAW OF THERMODYNAMICS. Every thermodynamic system 
possesses a state variable called the entropy S whose increase dS equals 
the heat absorbed in an infinitesimal reversible change, $dQ_{\text{rev}}$, divided by 
the simultaneously defined "absolute temperature" T. The entropy of 
an isolated system in equilibrium is a maximum. 


Historically, the beginnings of the second law can be found in the work 
of Carnot, who was interested in the study of the possible efficiencies of 
machines, such as steam engines, for converting heat into work. Even 
before the second law was clearly formulated, it was known that mechan- 
ical energy could be completely converted into heat at will, by means of 
friction for example, but that a given amount of heat could only be 
partially converted into work even in machines from which friction had 
been eliminated as much as possible. Accordingly, the early statements of 
the second law were phrased in terms of the impossibility of certain 
machines. One form due to Clausius says: It is impossible to construct 
a device which, operating in a cycle, will produce no effect other than the 
transfer of heat from a cooler to a hotter body. (This form reflects the 
experience that work is always required to transfer heat from a cold to a 
hot reservoir since the natural direction of heat flow is from hot to cold.) 
Another form is due to Kelvin and Planck: It is impossible to construct 
an engine which operates in a cycle and produces no effect other than the 
extraction of heat from a reservoir and the performance of an equal 
amount of work. (We note first of all that such a machine would not 
violate the law of conservation of energy. This form reflects the experience 
that as yet no one has built an engine which operates without rejecting 
some heat at a lower temperature than the temperature at which heat is 
taken in.) Our approach will be to show that our statement of the second 
law is equivalent to these older statements; it is interesting to note their 
negative aspect in contrast to our more positive postulation form. 

It is clear that it would be of great practical importance if the second 
law were not correct because there are tremendous stores of thermal 
energy available in the oceans and the earth, and all that one would need 
to do would be to cool them off by extracting heat from them. However, 
this cannot be done because it has been found by experience that we 
always require something of still lower temperature to which we can 
reject some heat. It should also be noted that these older formulations 
emphasize the cyclic nature of the process. A little thought, however, 
quickly shows that this is the only feasible way of constructing a machine 
which could continue working indefinitely. For example, if one obtained 
work by using the expansion of a gas to operate a piston, the gas pressure 
would decrease during the expansion until it eventually reached atmos- 
pheric pressure and then the whole process would cease. In order to 
continue getting work, the gas would have to be compressed again, thus 
producing a cycle. A machine which would operate in violation of the 
second law is called a perpetual motion machine of the second kind. 

We shall not show the precise equivalence of the Clausius and Kelvin- 
Planck statements of the second law, although it is reasonably clear that 
they refer to essentially the same sort of thing. Instead we shall proceed 
directly to the rather lengthy and indirect process of showing their equiv- 
alence to our statement about the existence of entropy; we begin with 
the consideration of a special engine devised by Carnot. 


10-1 The Carnot cycle 


Let us assume that the working part of our engine is an arbitrary but 
homogeneous fluid. Thus we can define its state completely by giving 
only the two mechanical variables p and V; the temperature $\theta$, as measured 
on some convenient, arbitrary empirical scale, can then be determined 
from the equation of state. The Carnot cycle through which we imagine our 
fluid taken is illustrated in the pV plane of Fig. 10-1 and consists of the 
two isothermal processes (1 → 2 and 3 → 4) and the two adiabatic 
processes (2 → 3 and 4 → 1) so that the system is finally brought back to 
the initial state 1. All processes are assumed to be reversible. As the 
system traverses the isothermal 1 → 2, it is necessary to add the heat $Q_1$ 
to the system from an external heat reservoir (like the boiler of a 
locomotive) at the constant temperature $\theta_1$, while, on the other isothermal 
3 → 4, an amount of heat $Q_2$ is rejected by the system to a cooler reservoir 
(such as the atmosphere) at the temperature $\theta_2$. The net amount of heat 
added is 

$$\Delta Q = Q_1 - Q_2 \tag{10-1}$$

Fig. 10-1 


The total work $W_s$ done by the system on the surroundings is 

$$W_s = -W = \oint p\,dV \tag{10-2}$$

which equals the area enclosed by the cycle on the pV plane. Since the 
system returns to its initial state, the change in energy is zero according 
to (9-2); hence $\Delta Q + W = \Delta Q - W_s = 0$, or 

$$W_s = Q_1 - Q_2 \tag{10-3}$$


We define the efficiency $\eta$ of the cycle as the ratio of the work done by the 
system to the heat input, so that 

$$\eta = \frac{W_s}{Q_1} = 1 - \frac{Q_2}{Q_1} \tag{10-4}$$

with the help of (10-3). 
We now want to show that the efficiency of this engine is independent 
of the properties of the particular fluid which is being used. Let us 
consider two engines E and E' which use different fluids but which operate 
between the same heat reservoirs of temperatures $\theta_1$ and $\theta_2$ and which do 
the same work $W_s$. The quantities of heat corresponding to E' are $Q_1'$ 
and $Q_2'$ and its efficiency is $\eta'$. Suppose we assume that 

$$\eta' > \eta \tag{10-5}$$


Let us arrange the engines so that E (which is reversible) operates as a 
refrigerator on the cycle 1 4 3 2 1 of Fig. 10-1 and is driven by E' so that 
E' adds the work $|W_s'| = |W_s|$ to E; this situation is illustrated very 
schematically in Fig. 10-2. From (10-4) we have $\eta' = |W_s|/Q_1'$ and 
$\eta = |W_s|/Q_1$, so that we find from (10-5) that $Q_1 > Q_1'$; in other words, 
the hotter reservoir gets more heat from E than it gives to E'. Since the 
two engines are being operated simultaneously, the net effect of one cycle 
is that this quantity of heat, $Q_1 - Q_1'$, is transferred from the lower 
temperature $\theta_2$ to the higher temperature $\theta_1$ without performing work and 
without making any permanent changes in the two engines or their 
surroundings. However, this violates the Clausius statement of the second 
law, and we must conclude that (10-5) is impossible. Similarly, by 
interchanging the roles of E and E', we can show that $\eta > \eta'$ is equally 
impossible; hence a consequence of the second law is that 

$$\eta = \eta' \tag{10-6}$$


This says that all reversible Carnot engines which exchange heat only at 
the two temperatures $\theta_1$ and $\theta_2$ have equal efficiencies. Therefore $\eta$ can 
depend only on these two temperatures, and we see from (10-4) that 

$$\frac{Q_1}{Q_2} = f(\theta_1, \theta_2) \tag{10-7}$$

where f is a universal function of the temperatures and is independent of 
the nature of the working fluid. 

Fig. 10-2 

We also see now that this conclusion is not restricted to fluid systems 
but applies to any material, because we used only the fact that the Carnot 
cycle involves adiabatics and isothermals. We now want to determine the 
properties of this function f defined in (10-7). 


10-2 Absolute temperature 


Let us consider two reversible Carnot engines (E and E') operating 
between the two temperatures $\theta_1$ and $\theta_2$ as well as using a reservoir of 
arbitrary, but intermediate, temperature $\theta_0$ so that $\theta_0$ acts as the cool 
reservoir for one cycle, absorbing the heat $Q_0$ and rejecting this same 
amount to the other engine for which it serves as the hot reservoir. This 
process is illustrated in Fig. 10-3. It is clear that this arrangement is, in 
effect, a single engine operating between the temperatures $\theta_1$ and $\theta_2$, 
absorbing the heat $Q_1$ and rejecting the heat $Q_2$; thus (10-7) applies to 
the over-all system. If we apply (10-7) to the engines separately, we can 
also write 

$$\frac{Q_1}{Q_0} = f(\theta_1, \theta_0), \qquad \frac{Q_0}{Q_2} = f(\theta_0, \theta_2) \tag{10-8}$$

Multiplying together the two equations of (10-8) and using (10-7), we 
find that 

$$f(\theta_1, \theta_2) = f(\theta_1, \theta_0)\,f(\theta_0, \theta_2) \tag{10-9}$$


If we consider the special case $\theta_1 = \theta_2$, then $Q_1 = Q_2$, so that 
$f(\theta_2, \theta_2) = 1$ by (10-7) and (10-9) becomes 
$1 = f(\theta_2, \theta_0)\,f(\theta_0, \theta_2)$ and therefore 

$$f(\theta_0, \theta_2) = \frac{1}{f(\theta_2, \theta_0)} \tag{10-10}$$

Fig. 10-3 

As a result, (10-9) can now be written 

$$f(\theta_1, \theta_2) = \frac{f(\theta_1, \theta_0)}{f(\theta_2, \theta_0)} \tag{10-11}$$
which has the appearance of $\theta_0$ having "cancelled out" in order to make 
the left side independent of $\theta_0$. Therefore we conclude that it is possible 
to satisfy (10-11) in general by writing f in the form 

$$f(\theta_1, \theta_2) = \frac{\phi(\theta_1)}{\phi(\theta_2)} \tag{10-12}$$

where $\phi(\theta)$ is another universal function, but now only of a single 
temperature. With the substitution of (10-12) into (10-7), we find that 

$$\frac{Q_1}{Q_2} = \frac{\phi(\theta_1)}{\phi(\theta_2)} \tag{10-13}$$


Since $\phi$ is a universal function of the empirical temperature $\theta$, we can 
use the existence of this function to define a new temperature scale, which 
we shall call the absolute temperature T, by means of the equation 

$$T = \phi(\theta) \tag{10-14}$$

putting off until later the question of exactly how this can be done in 
practice. Combining (10-14) and (10-13), we find that 

$$\frac{Q_1}{Q_2} = \frac{T_1}{T_2} \tag{10-15}$$

so that the efficiency as obtained from (10-4) is 

$$\eta = 1 - \frac{T_2}{T_1} = \frac{T_1 - T_2}{T_1} \tag{10-16}$$


If we want to compare the absolute temperatures of the two reservoirs, we 
see now that in principle all we have to do is to operate a reversible Carnot 
engine between these reservoirs and measure the engine’s efficiency. The 
efficiency will be unity only when $T_2 = 0$; since this would represent the 
complete conversion of heat into work, it is reasonable to call this tem- 
perature absolute zero. 
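
The efficiency formula (10-16) and its use for comparing two absolute temperatures can be put in a few lines of code. This is a sketch under obvious assumptions: the function names are hypothetical and the reservoir temperatures are arbitrary illustrative values on the absolute scale.

```python
# Carnot efficiency from (10-16) and the temperature ratio it implies.
# Function names and the sample temperatures are illustrative assumptions.

def carnot_efficiency(T_hot, T_cold):
    """eta = 1 - T_cold/T_hot for a reversible engine between two reservoirs."""
    if T_hot <= 0.0 or T_cold < 0.0 or T_cold > T_hot:
        raise ValueError("require 0 <= T_cold <= T_hot and T_hot > 0")
    return 1.0 - T_cold / T_hot


def temperature_ratio_from_efficiency(eta):
    """T_cold/T_hot inferred from a measured reversible efficiency."""
    return 1.0 - eta


eta = carnot_efficiency(T_hot=500.0, T_cold=300.0)
ratio = temperature_ratio_from_efficiency(eta)
print(eta, ratio)
```

The second function is the thermometric use described above: a measured reversible efficiency fixes the ratio of the two absolute temperatures, and the efficiency reaches unity only when the cold reservoir is at absolute zero.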

Before we go on to apply these results to an arbitrary reversible cycle, 
we shall first show that the absolute temperature scale T defined by 
(10-14) is identical with the ideal gas temperature scale (which we 
temporarily write as $T_g$) and which is defined by the ideal gas equation of 
state $pV = \nu RT_g$. In order to do this, let us assume that we are using $\nu$ 
kilomoles of an ideal gas as the working fluid in the Carnot cycle of Fig. 
10-1, in which we now label the isothermal portions with the temperatures 
$T_{g1}$ and $T_{g2}$. 

For the isothermal processes, dT = 0, so that dU = 0 by (9-14) and 
therefore dQ = p dV by (9-4). Integrating along the isothermal 1 → 2 and 
using (8-21), we find 

$$Q_1 = \int_1^2 p\,dV = \int_1^2 \nu RT_{g1}\,\frac{dV}{V} = \nu RT_{g1} \ln\frac{V_2}{V_1} \tag{10-17}$$

Similarly, 

$$Q_2 = \int_4^3 p\,dV = \nu RT_{g2} \ln\frac{V_3}{V_4} \tag{10-18}$$

and then 

$$\frac{Q_1}{Q_2} = \left(\frac{T_{g1}}{T_{g2}}\right) \frac{\ln(V_2/V_1)}{\ln(V_3/V_4)} \tag{10-19}$$


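Equation (9-30') applies to the adiabatic processes, so that for the 
expansion 2 → 3 we have 

$$T_{g1}V_2^{\gamma-1} = T_{g2}V_3^{\gamma-1} \tag{10-20}$$

while for 4 → 1 we have 

$$T_{g2}V_4^{\gamma-1} = T_{g1}V_1^{\gamma-1} \tag{10-21}$$

Multiplying (10-20) and (10-21) together, we find that 

$$T_{g1}T_{g2}(V_2V_4)^{\gamma-1} = T_{g1}T_{g2}(V_3V_1)^{\gamma-1}$$

and therefore $V_2V_4 = V_3V_1$; this can also be written as $V_2/V_1 = V_3/V_4$, 
which leads to 

$$\frac{\ln(V_2/V_1)}{\ln(V_3/V_4)} = 1 \tag{10-22}$$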
When we insert (10-22) into (10-19) and use (10-15), we find that 

$$\frac{Q_1}{Q_2} = \frac{T_{g1}}{T_{g2}} = \frac{T_1}{T_2} \tag{10-23}$$

where $T_1$ and $T_2$ are the absolute temperatures. We can conclude from 
this that 

$$T_g = T_{\text{absolute}} \tag{10-24}$$



provided that the degree is chosen to be the same for each scale. Accord- 
ingly, we no longer need to decide whether the temperature scale we are 
using is based on (8-21) or on (10-15), and we can simply continue writing 
T for the temperature, which we shall always measure on the absolute 
scale, unless we specifically say otherwise. 
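
The chain of steps (10-17) through (10-23) can be followed numerically for a specific ideal-gas Carnot cycle. The sketch below is only illustrative: the value of R, γ = 5/3, the reservoir temperatures, the volumes, and the function name are all assumptions introduced for this example.

```python
import math

# Ideal-gas Carnot cycle: heats on the two isothermals, as in (10-17)-(10-18).
# Assumptions (not from the text): R per kilomole in SI units, gamma = 5/3,
# and the particular temperatures and volumes used below.

R = 8314.0  # J/(kilomole K), approximate gas constant


def carnot_heats(T1, T2, V1, V2, gamma, nu=1.0):
    """Return (Q1, Q2, V3, V4) for the cycle 1 -> 2 -> 3 -> 4 -> 1."""
    stretch = (T1 / T2) ** (1.0 / (gamma - 1.0))
    V3 = V2 * stretch  # adiabatic 2 -> 3: T1 V2**(g-1) = T2 V3**(g-1), (10-20)
    V4 = V1 * stretch  # adiabatic 4 -> 1: T2 V4**(g-1) = T1 V1**(g-1), (10-21)
    Q1 = nu * R * T1 * math.log(V2 / V1)  # heat absorbed at T1
    Q2 = nu * R * T2 * math.log(V3 / V4)  # heat rejected at T2
    return Q1, Q2, V3, V4


T1, T2 = 500.0, 300.0
Q1, Q2, V3, V4 = carnot_heats(T1, T2, V1=1.0, V2=3.0, gamma=5.0 / 3.0)
efficiency = 1.0 - Q2 / Q1
print(Q1 / Q2, T1 / T2, efficiency)
```

The two printed ratios should agree, so the efficiency computed from the heats reproduces the value given by the temperatures alone.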
10-3 Arbitrary reversible cycles and the existence of entropy 


It will be convenient to revert to our normal sign convention for heat 
for which the rejected heat is considered to be negative, so that (10-15) 
becomes $Q_1/T_1 = -Q_2/T_2$. If we now apply this basic result to a Carnot 
cycle whose isothermal portions represent infinitesimal changes of state, 
the quantities of heat transferred will also be infinitesimal and we shall 
have $dQ_1/T_1 = -dQ_2/T_2$ or 

$$\frac{dQ_1}{T_1} + \frac{dQ_2}{T_2} = 0 \tag{10-25}$$

Let us now consider an arbitrary, reversible cycle involving an arbitrary 
fluid which can be represented by the closed curve in the pV plane as 
shown in Fig. 10-4. As shown by the heavy saw-tooth line on the figure, 
we can approximate the closed curve representing this cycle as closely as 
we please by using the isothermals and portions of the adiabatics of 
properly chosen, neighboring infinitesimal Carnot cycles. Since (10-25) 
holds for each of the infinitesimal cycles, when we sum these expressions 
up for all the exchanges of heat for the whole reversible cycle, we obtain 
in the limit 

$$\oint \frac{dQ_{\text{rev}}}{T} = 0 \tag{10-26}$$

where we have added the subscript to dQ to emphasize the reversible 
nature of the cycle. According to (7-19) the vanishing of this integral over 
the closed path means that the integrand is an exact differential of some 
function which we call entropy and write as S, so that 

$$dS = \frac{dQ_{\text{rev}}}{T} \tag{10-27}$$


For a fluid, where we can use the first law in the form (9-4), we have 

$$T\,dS = dU + p\,dV \tag{10-28}$$


The result (10-27) is the basis for sometimes defining the absolute tem- 
perature as the integrating factor for heat. 

Although we have obtained this result only for the special system 
consisting of a homogeneous fluid, it can be shown without too much 
difficulty that (10-26) is also true for any reversible cycle of any system of 
interest and is equivalent to the first part of our statement of the second 
law, where it was asserted that every system possesses entropy as a state 
variable. 

The difference in the entropy of two arbitrary states A and B can be 
calculated by integrating (10-27) to obtain 


$$S_B - S_A = \int_A^B \frac{dQ_{\text{rev}}}{T} \tag{10-29}$$


It is important to remember that the path of integration used in (10-29) is 
completely arbitrary, as long as it is reversible, since the entropy depends 
only on the state; thus the path chosen to calculate $S_B - S_A$ can be 
chosen solely for convenience, and it need have no relation to the actual 
way in which the system changed its state from A to B. In particular, the 
process A — B may have been completely irreversible, yet (10-29) will, of 
course, give the correct entropy difference. Let us illustrate these remarks 
with a particular example. 


Example. Ideal Gas in a Porous Plug Experiment. Since the process is
adiabatic, $dQ_{\mathrm{actual}} = 0$ and for the real process

$$\int_A^B \frac{dQ_{\mathrm{actual}}}{T} = 0 \tag{10-30}$$


but this is not the entropy change because the actual process is irrever- 
sible. According to (10-29), in order to calculate the entropy change we 
need a reversible process which will take the system from the same 
initial state to the same final state which, we recall, corresponds to no 
temperature change for an ideal gas; let us accordingly choose an 
isothermal reversible expansion from $V_A$ to $V_B$ as our process to be
used for calculations.

Since we are considering an ideal gas and T = const., dU = 0 and
(9-4) becomes $dQ_{\mathrm{rev}} = p\,dV$. We also have pV = νRT = const., from
(8-21), so that (10-29) becomes

$$S_B - S_A = \int_{V_A}^{V_B} \frac{p\,dV}{T} = \nu R \int_{V_A}^{V_B} \frac{dV}{V} = \nu R \ln\frac{V_B}{V_A} \tag{10-31}$$

It will be left as an exercise to show that the same result will be obtained 
by integrating along any other reversible path between these two states. 


This example, with its contrast between (10-30) and (10-31), clearly
shows that the existence and value of the entropy of the final state depend
only on the state and not on whether it was reached reversibly or irreversibly.
We also note that (10-29) gives us only changes in the entropy, so that the 
actual value of S for a given state is determined only up to an additive 
constant; later, we shall find several ways of deciding the proper value 
for this constant. 
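
The path independence asserted for (10-29) can also be checked numerically. The following Python sketch is not part of the original text: it integrates dQ_rev/T for an ideal gas along the isotherm used in the example and along a two-step path (isobaric expansion followed by isochoric cooling) between the same end states. The numerical value of R, the constant heat capacity of (3/2)νR, and all function names are assumptions made only for illustration.

```python
import math

R = 8314.0  # J/(kmol K); numerical value assumed for illustration


def integrate(f, a, b, steps=20000):
    """Midpoint-rule approximation to the integral of f from a to b."""
    h = (b - a) / steps
    return sum(f(a + (k + 0.5) * h) for k in range(steps)) * h


def delta_s_isothermal(nu, T, V_a, V_b):
    """Integrate dQ_rev/T = p dV / T along the isotherm, with p = nu R T / V."""
    return integrate(lambda V: (nu * R * T / V) / T, V_a, V_b)


def delta_s_two_step(nu, T, V_a, V_b, c_v):
    """Isobaric expansion V_a -> V_b, then isochoric cooling back to T."""
    T_hot = T * V_b / V_a          # temperature reached at constant pressure
    c_p = c_v + nu * R             # ideal gas relation (9-12)
    isobaric = integrate(lambda t: c_p / t, T, T_hot)
    isochoric = integrate(lambda t: c_v / t, T_hot, T)
    return isobaric + isochoric


nu, T, V_a, V_b = 1.0, 300.0, 1.0, 2.0
c_v = 1.5 * nu * R                 # assumed constant heat capacity
exact = nu * R * math.log(V_b / V_a)   # right-hand side of (10-31)
s_iso = delta_s_isothermal(nu, T, V_a, V_b)
s_two = delta_s_two_step(nu, T, V_a, V_b, c_v)
print(exact, s_iso, s_two)
```

Within the accuracy of the numerical integration, both paths should reproduce the value given by (10-31), although the heat exchanged along them is quite different.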


10-4 Irreversible processes and the second part of the second law 


Let us again consider the situation involving the two Carnot engines E 
and E’ which is depicted in Fig. 10-2, but let us now assume that E’ is not 
reversible. If we again assume that $\eta' > \eta$ and repeat all the considerations
following (10-5), we shall again find at the end of one cycle that the
net effect has been a transfer of heat from the cold reservoir to the hot one.
Since this is impossible according to Clausius’ statement of the second
law, we conclude as before that it is impossible to have $\eta' > \eta$. We
clearly cannot have equal efficiencies, $\eta' = \eta$, because E′ is irreversible
and has dissipative energy losses that E does not have. Thus the only
possibility is that

$$\eta > \eta' \tag{10-32}$$


Hence a reversible Carnot engine has a greater efficiency than an irre- 
versible Carnot engine which is operating between the same temperatures 
and producing the same work per cycle. Because of (10-32), we have 
$(1 - \eta) < (1 - \eta')$ and therefore (10-4) yields

$$\frac{Q_c}{Q_h} < \frac{Q_c'}{Q_h'} \tag{10-33}$$

If we use (10-15) to replace $Q_c/Q_h$ in (10-33), we find that $T_c/T_h < Q_c'/Q_h'$,
or that

$$\frac{Q_h'}{T_h} < \frac{Q_c'}{T_c} \tag{10-34}$$

holds for an irreversible Carnot engine. If the engine has infinitesimal
isothermals in it, we can write (10-34) as $(dQ_h'/T_h) < (dQ_c'/T_c)$, which
becomes

$$\frac{dQ_h'}{T_h} + \frac{dQ_c'}{T_c} < 0 \tag{10-35}$$

when we again regard rejected heat as negative.

If we now consider an arbitrary cycle that is completely or partially 
irreversible by subdividing it into a large number of infinitesimal cycles 
exactly as we did after (10-25), then because (10-35) is now applicable 
rather than (10-25) we obtain 


$$\oint \frac{dQ}{T} < 0 \tag{10-36}$$

instead of (10-26). Combining (10-26) and (10-36), we can state the general
result that

$$\oint \frac{dQ}{T} \le 0 \tag{10-37}$$


with the equality holding only if the cycle is completely reversible. 

Let us now consider two states A and B and construct a cycle in which
we go from A to B by an irreversible process and from B to A by a reversible
process. The inequality of (10-37) applies to the whole cycle, and since we
can use (10-29) for the reversible part we obtain

$$\oint \frac{dQ}{T} = \int_A^B \frac{dQ_{\mathrm{irrev}}}{T} + \int_B^A \frac{dQ_{\mathrm{rev}}}{T} = \int_A^B \frac{dQ_{\mathrm{irrev}}}{T} + S_A - S_B < 0$$

which can be written

$$S_B - S_A > \int_A^B \frac{dQ_{\mathrm{irrev}}}{T} \tag{10-38}$$


and is applicable to any system. We have already seen an example of this 
result in our discussion of the porous plug experiment in the last section 
where we found the entropy change by (10-31) to be greater than the 
value of the integral given by (10-30). 

In general, we can write 


$$dQ_{\mathrm{irrev}} = (dQ^e)_{\mathrm{irrev}} + (dQ^i)_{\mathrm{irrev}} \tag{10-39}$$

where the superscript e refers to heat transfer between the system of
interest and its surroundings and i refers to internal transfers of heat
within the system. For an isolated system, $(dQ^e)_{\mathrm{irrev}} = 0$, and (10-38)
becomes

$$S_B - S_A > \int_A^B \frac{(dQ^i)_{\mathrm{irrev}}}{T} \tag{10-40}$$

If there are no internal transfers of heat in the system, the integral is zero
and $S_B > S_A$. If there are internal transfers of heat within the system,
the integral will be positive, as we can show by considering the typical
situation in which a quantity of heat Q is transferred by the natural
irreversible process of conduction from a portion of the system at temperature
$T_1$ to another portion at temperature $T_2$; the contribution of
this process to the integral of (10-40) is

$$-\frac{Q}{T_1} + \frac{Q}{T_2} = Q\left(\frac{1}{T_2} - \frac{1}{T_1}\right) > 0$$

since $T_1 > T_2$ because the heat flow is from the higher temperature region
to that of lower temperature. Therefore $S_B > S_A$ also when there are
internal transfers of heat.

Consequently, for an isolated system, we shall have 


$$S_B > S_A \tag{10-41}$$


in general, except for the case when the system is in internal equilibrium;
then no irreversible processes can occur, the equality of (10-37) is applicable,
there are no transfers of heat, and $S_B = S_A$. In other words, the
entropy of an isolated system in equilibrium is a constant, and in fact it is 
a maximum since (10-41) shows us that any entropy changes which will 
occur in an isolated system will be increases. Thus we have demonstrated 
the validity of the second part of the second law. By using (10-41) as a 
way of stating the second law, we ascribe a definite sense or direction to 
natural processes; equation (10-41) is also sometimes helpful in deciding 
whether a certain process is spontaneously possible or not by determining 
whether or not the corresponding entropy change would satisfy (10-41). 
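
As an illustration of how (10-41) can be used to test a process, the following Python sketch (not part of the original text) computes the total entropy change when two bodies at different temperatures are brought into thermal contact inside an isolated enclosure. The assumptions are that the bodies are identical, that each has a constant heat capacity C, and that no work is done; the function name is hypothetical. The entropy change of each body is evaluated along a reversible heating or cooling path, as (10-29) permits.

```python
import math


def entropy_change_on_contact(C, T1, T2):
    """Return (final temperature, total entropy change) for two identical
    bodies of constant heat capacity C that exchange heat only with each other."""
    T_f = 0.5 * (T1 + T2)                 # energy conservation
    dS_1 = C * math.log(T_f / T1)         # integral of C dT / T from T1 to T_f
    dS_2 = C * math.log(T_f / T2)         # integral of C dT / T from T2 to T_f
    return T_f, dS_1 + dS_2


T_f, dS_total = entropy_change_on_contact(C=1000.0, T1=400.0, T2=300.0)
print(T_f, dS_total)
```

The total is positive whenever the initial temperatures differ, so the equalization of temperature is allowed by (10-41), while the reverse process would require a decrease of entropy and is excluded.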


10-5 Relation between the energy and the equation of state 


Several times in our discussions we have needed the dependence of the 
energy on the other variables, particularly on the volume. We shall 
reconsider this problem as an illustration of one application of the second 
law. 

Suppose we regard T and V as the independent variables so that we 
can write S = S(T, V). If we calculate dS, equate the result to the com- 
bined first and second laws (10-28), and use (9-3) in order to express 
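everything completely in terms of T and V, we find

$$dS = \left(\frac{\partial S}{\partial T}\right)_V dT + \left(\frac{\partial S}{\partial V}\right)_T dV = \frac{dU + p\,dV}{T}$$

$$= \frac{1}{T}\left[\left(\frac{\partial U}{\partial T}\right)_V dT + \left(\frac{\partial U}{\partial V}\right)_T dV + p\,dV\right]$$

and, therefore, on equating coefficients of corresponding differentials, we
find that

$$\left(\frac{\partial S}{\partial T}\right)_V = \frac{1}{T}\left(\frac{\partial U}{\partial T}\right)_V \tag{10-42}$$

$$\left(\frac{\partial S}{\partial V}\right)_T = \frac{1}{T}\left[\left(\frac{\partial U}{\partial V}\right)_T + p\right] \tag{10-43}$$

From (9-6) and (10-42), we see that we can also write

$$C_V = T\left(\frac{\partial S}{\partial T}\right)_V \tag{10-44}$$

We also see from (10-43) that (9-9) can be written

$$C_p - C_V = T\left(\frac{\partial S}{\partial V}\right)_T \left(\frac{\partial V}{\partial T}\right)_p \tag{10-45}$$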


However, it is possible to put (C, — C,) into an even more useful form. 
We base our considerations on the equality of the mixed second partial 
derivatives as expressed by (7-22) which, for this case, becomes 


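$$\frac{\partial^2 S}{\partial V\,\partial T} = \frac{\partial^2 S}{\partial T\,\partial V}$$

so that, when (10-42) and (10-43) are used, we find that

$$\frac{\partial}{\partial V}\left[\frac{1}{T}\left(\frac{\partial U}{\partial T}\right)_V\right]_T = \frac{1}{T}\frac{\partial^2 U}{\partial V\,\partial T} = \frac{\partial}{\partial T}\left\{\frac{1}{T}\left[\left(\frac{\partial U}{\partial V}\right)_T + p\right]\right\}_V$$

$$= -\frac{1}{T^2}\left[\left(\frac{\partial U}{\partial V}\right)_T + p\right] + \frac{1}{T}\left[\frac{\partial^2 U}{\partial T\,\partial V} + \left(\frac{\partial p}{\partial T}\right)_V\right]$$

which, when the mixed second derivative of U is cancelled from both sides,
leads to

$$\left(\frac{\partial U}{\partial V}\right)_T = T\left(\frac{\partial p}{\partial T}\right)_V - p \tag{10-46}$$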
Then (9-9) can be written 
$$C_p - C_V = T\left(\frac{\partial p}{\partial T}\right)_V \left(\frac{\partial V}{\partial T}\right)_p \tag{10-47}$$


We see that the last two equations are now in such a form that in order 
to evaluate the important quantities on the left we need only know the 
equation of state. On comparing (10-43) and (10-46), we also get the 


interesting result that 
$$\left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial p}{\partial T}\right)_V \tag{10-48}$$

which enables us to learn something about the entropy of a system from
its equation of state.


Example. Energy of an Ideal Gas. We find from pV = νRT that
$(\partial p/\partial T)_V = \nu R/V$, which when used in (10-46) gives $(\partial U/\partial V)_T = 0$ just
as in (9-11). Again we find that the energy of an ideal gas is independent 
of the volume as we deduced from the free expansion and porous plug 
experiments, except that now we see that it follows rigorously from the 
known equation of state and the formula (10-46) which was derived 
from the second law. 


Example. Relation between the Heat Capacities. The result (10-47) 
which also follows from the second law has been tested by experiment. 
It is usual to express it in terms of more directly determined experi- 
mental quantities. If we use (8-2) to evaluate the derivatives, we find 
that (10-47) can be written 


$$C_p - C_V = \alpha\beta p V T \tag{10-49}$$

and, if we use (8-4) to eliminate β, we obtain

$$C_p - C_V = \frac{\alpha^2 V T}{\kappa_T} \tag{10-50}$$


which is a very general result and is especially useful for discussing 
solids. 

In the case of an ideal gas we have α = β = 1/T by (8-16) and (10-49)
becomes

$$C_p - C_V = \frac{pV}{T} = \nu R$$

exactly as we found before in (9-12), except that now we can regard it
as having been obtained more accurately, since we required only the
knowledge of the equation of state.
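
The statement that (10-47) needs only the equation of state can be illustrated numerically. The Python sketch below is not part of the original text: it evaluates the two derivatives in (10-47) by central differences, using nothing but the ideal gas law solved once for p and once for V. The numerical value of R, the step size, and the function names are assumptions made for illustration.

```python
R = 8314.0  # J/(kmol K); numerical value assumed for illustration


def pressure(nu, T, V):
    """Ideal gas equation of state solved for p."""
    return nu * R * T / V


def volume(nu, T, p):
    """Ideal gas equation of state solved for V."""
    return nu * R * T / p


def cp_minus_cv(nu, T, V, h=1e-2):
    """Evaluate T (dp/dT)_V (dV/dT)_p of (10-47) by central differences."""
    p = pressure(nu, T, V)
    dp_dT = (pressure(nu, T + h, V) - pressure(nu, T - h, V)) / (2 * h)
    dV_dT = (volume(nu, T + h, p) - volume(nu, T - h, p)) / (2 * h)
    return T * dp_dT * dV_dT


nu = 2.0
difference = cp_minus_cv(nu, T=300.0, V=5.0)
print(difference, nu * R)
```

The two printed numbers should agree to within rounding error, whatever values of T and V are chosen. Replacing the two equation-of-state functions by those of another system gives the corresponding difference in heat capacities in the same way.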


Exercises 


10-1. Show that the Clausius and Kelvin-Planck statements of the second 
law are equivalent. 

10-2. Show, by integrating over at least three different paths in the pV plane 
which have the same values of T at the beginning and end, that the value of the 
entropy difference for an ideal gas is always given by (10-31). 

10-3. Verify that (10-47) is satisfied by the system of Exercise 9-2. 

10-4. It is asserted that the equation of state of a certain system is given by 
pV = DT + Bp and its energy is given by U = CT — (A/V), where A, B, C, 
and D are constants. Show, with the help of (10-47), that this is impossible.

10-5. The energy of a system is given by $U = bVT^4$ and the equation of state
is given by pV = aU, where a and b are constants. Show that a = 1/3.

10-6. What is the shape of a Carnot cycle when it is drawn in the TS plane? 
What is the physical interpretation of the area? To what aspect of the figure is
the efficiency related?


11 Thermodynamic potentials


The last few examples of the previous chapter illustrate the useful type of 
relation which can be obtained simply from the knowledge that dS, for 
example, is an exact differential. On the other hand, it is also evident that 
a certain amount of skill is required to be able to choose the appropriate 
independent variables as well as exactly how to proceed in differentiation. 
Fortunately, these methods can be systematized in such a way that the 
more significant and useful relations can be selected without too much 
difficulty from the extremely large number of possible ones. 


11-1 The potentials and their natural variables 


For a homogeneous system possessing one mechanical and one thermal 
degree of freedom, we have found that there are two pairs of conjugate
variables (T, S) and (−p, V); each pair consists of one extensive variable
(S, V) and its conjugate intensive variable (T, −p). The mechanical
intensive variable is −p because the first law has the form


dU = TdS — pdV (11-1) 


This differential form indicates that it is natural to regard the extensive 
variable U as a function of the other extensive variables so that 
U = U(S, V). Since we would then have 


$$dU = \left(\frac{\partial U}{\partial S}\right)_V dS + \left(\frac{\partial U}{\partial V}\right)_S dV \tag{11-2}$$


we see, on comparing with (11-1), that the intensive variables would be 
given as functions of the extensive variables by 


$$T = \left(\frac{\partial U}{\partial S}\right)_V, \qquad -p = \left(\frac{\partial U}{\partial V}\right)_S \tag{11-3}$$


which are the same results we found first for the ideal gas in (9-45).


The last equations show us that in a sense we can call U a “potential” 
by analogy with the mechanical case, because the variables conjugate to 
the independent variables can be simply obtained from the function U by 
differentiation with respect to the corresponding independent variable. 
However, the selection of the independent variables may actually be 
determined primarily by experimental convenience; if we want to use 
pairs of independent variables consisting of one thermal variable and one 
mechanical variable per pair, then for a system of constant composition 
there is a total of four possibilities from which to make our choice: 
(S, V); (S, p); (T, V); (T, p). We know that U is the potential for which
(S, V) are the natural independent variables, and we want to find the 
potentials appropriate to the other pairs of variables. Since we want to 
replace an original variable by its conjugate as the new independent 
variable, the appropriate method is to form the necessary Legendre 
transformations of the energy according to the rule discussed in Sec. 7-2 
and given by (7-13), that is, by subtracting the product of the two con- 
jugate variables whose role we wish to interchange. Thus we can construct 
the following scheme for changing the variable pairs: 


from (S, V) to (S, p): subtract −pV
from (S, V) to (T, V): subtract TS
from (S, V) to (T, p): subtract −pV + TS


In this way, starting from U(S, V), we can form three more potentials 
which are functions of the indicated variables: 


H(S, p) = U + pV = enthalpy (11-4a)
F(T, V) = U − TS = Helmholtz function (11-4b)
G(T, p) = U + pV − TS = Gibbs function (11-4c)


The function F is sometimes called the free energy; G is also known as 
the Gibbs free energy. 

We can verify that the listed variables are actually the independent ones 
by calculating the differentials of (11-4) and using (11-1); the results are 
found to be 

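dH = T dS + V dp (11-5a)
dF = −S dT − p dV (11-5b)
dG = −S dT + V dp (11-5c)

Using the same method we used to obtain (11-3) from (11-1), we find at
once from (11-5) that

$$T = \left(\frac{\partial H}{\partial S}\right)_p, \qquad V = \left(\frac{\partial H}{\partial p}\right)_S \tag{11-6a}$$

$$-S = \left(\frac{\partial F}{\partial T}\right)_V, \qquad -p = \left(\frac{\partial F}{\partial V}\right)_T \tag{11-6b}$$

$$-S = \left(\frac{\partial G}{\partial T}\right)_p, \qquad V = \left(\frac{\partial G}{\partial p}\right)_T \tag{11-6c}$$

We also see that the right-hand equations of (11-6b) and (11-6c) are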
actually equations of state. These relations in which the conjugate vari- 
ables are derived by differentiation, as in (11-3), are partial justification 
for also giving the name of potential to the functions of (11-4). We can 
now proceed further and obtain some remarkable and useful relations 
from these results. 


11-2 Maxwell’s reciprocal relations 


These relations are simply expressions of the equality of mixed second 
partial derivatives. For example, from (11-3) and 


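$$\frac{\partial^2 U}{\partial S\,\partial V} = \frac{\partial^2 U}{\partial V\,\partial S}$$

we find that

$$\left(\frac{\partial T}{\partial V}\right)_S = -\left(\frac{\partial p}{\partial S}\right)_V \tag{11-7a}$$

Similarly, using the equations of (11-6) to form $\partial^2 H/\partial S\,\partial p$, $\partial^2 F/\partial T\,\partial V$,
and $\partial^2 G/\partial T\,\partial p$, we find that

$$\left(\frac{\partial T}{\partial p}\right)_S = \left(\frac{\partial V}{\partial S}\right)_p \tag{11-7b}$$

$$\left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial p}{\partial T}\right)_V \tag{11-7c}$$

$$\left(\frac{\partial S}{\partial p}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_p \tag{11-7d}$$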


We recall that in (10-48) we had already given (11-7c), which we had 
obtained in a different manner. 

The principal use of these reciprocal relations is to express certain 
derivatives in terms of more easily accessible quantities, as we shall see in 
examples to follow. The equations (11-7) are quite difficult to memorize
and, when they are needed, it is generally much easier and more accurate
to start afresh and derive them as needed from the definitions of the 
various potentials given by (11-4). 
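
A reciprocal relation can also be checked numerically once a potential is given as a function of its natural variables. The Python sketch below is not part of the original text: it assumes, purely for illustration, a Helmholtz function of the form appropriate to a monatomic ideal gas (up to terms linear in T), obtains S and p from it by the differentiations of (11-6b), and then compares the two sides of (11-7c). The value of R, the amount of gas, the step sizes, and all function names are assumptions.

```python
import math

R = 8314.0  # J/(kmol K); numerical value assumed for illustration
NU = 1.0    # amount of gas in kilomoles (assumed)


def helmholtz(T, V):
    """Assumed F(T, V) of a monatomic ideal gas, up to terms linear in T."""
    return -NU * R * T * (math.log(V) + 1.5 * math.log(T))


def entropy(T, V, h=1e-2):
    """S = -(dF/dT)_V from (11-6b), by central difference."""
    return -(helmholtz(T + h, V) - helmholtz(T - h, V)) / (2 * h)


def pressure(T, V, h=1e-3):
    """p = -(dF/dV)_T from (11-6b), by central difference."""
    return -(helmholtz(T, V + h) - helmholtz(T, V - h)) / (2 * h)


def maxwell_sides(T, V, h_T=1e-2, h_V=1e-3):
    """Return ((dS/dV)_T, (dp/dT)_V), the two sides of (11-7c)."""
    dS_dV = (entropy(T, V + h_V) - entropy(T, V - h_V)) / (2 * h_V)
    dp_dT = (pressure(T + h_T, V) - pressure(T - h_T, V)) / (2 * h_T)
    return dS_dV, dp_dT


lhs, rhs = maxwell_sides(T=300.0, V=2.0)
print(lhs, rhs, NU * R / 2.0)
```

Both sides should come out close to νR/V, as expected for an ideal gas. The agreement is limited only by the finite-difference step sizes, since the relation itself is an identity for any F(T, V) with continuous second derivatives.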


11-3 Use of the potentials to characterize thermodynamic 
equilibrium 


In mechanics the condition of stable equilibrium can be characterized 
as corresponding to a minimum in the potential energy. Since we have 
been describing these new functions (11-4) as “potentials,” it should not
be too surprising to find that extreme values of them also correspond to 
equilibrium situations. We can begin this investigation by restating our 
basic result that the entropy of an isolated system never decreases. By 
definition, an isolated system cannot interchange heat or work with its 
surroundings; hence we know that U and V will both be constant. Since 
the system will be in equilibrium when no more changes occur, and since 
all changes in this isolated system will tend to increase its entropy, equilib- 
rium will correspond to a maximum value of the entropy. Thus we can 
state our fundamental condition for equilibrium for an isolated system as 


$$S_i = \text{maximum}, \qquad U_i = \text{const.}, \qquad V_i = \text{const.} \tag{11-8}$$


In practice, however, almost never are we interested in nor do we deal 
with a strictly isolated system, because the situations we want to discuss 
involve systems which are not isolated but which are free to interact with 
their surroundings, that is, with the rest of the laboratory, the city, the 
atmosphere, and in principle with the whole universe, although many of 
these interactions are really quite small. Therefore it is natural to ask 
whether it is possible to discuss the equilibrium of a general system, which 
is free to interact with its surroundings, solely in terms of a function or 
functions characteristic only of the system itself and properly chosen so 
as to take into account the behavior of the surroundings which generally 
are of an indeterminate and unspecifiable nature. However, we can 
obviously take the system of interest plus its surroundings as a large isolated 
system to which (11-8) applies. In order to accomplish this in a quanti- 
tative manner, it is customary in thermodynamics to replace the rather 
indefinite surroundings by the definite concept of a heat reservoir, which is 
defined as a very large system at the temperature T and whose heat
capacity is assumed to be so large that its temperature remains constant
no matter how much heat is added to or taken from it. Thus we take the
system of interest plus the heat reservoir as our isolated system, as
illustrated schematically in Fig. 11-1.

[Fig. 11-1: system of interest (S) + heat reservoir ($S_{\mathrm{res}}$) = isolated system ($S_i$)]


Since $S_i = S + S_{\mathrm{res}}$, the fundamental condition that the entropy of an
isolated system can only increase becomes

$$dS + dS_{\mathrm{res}} \ge 0 \tag{11-9}$$


We can easily calculate the entropy change of the reservoir because the 
only heat gained by the reservoir is due to an equal heat loss from the 
system of interest, so that 


$$dS_{\mathrm{res}} = \frac{dQ_{\mathrm{res}}}{T} = -\frac{dQ}{T} = -\frac{dU + p\,dV}{T} \tag{11-10}$$

and therefore (11-9) can be written as $dS \ge (dU + p\,dV)/T$, or

$$T\,dS \ge dU + p\,dV \tag{11-11}$$


which is actually the fundamental statement (11-9) but now involves only 
quantities referring to the system of interest because the effect of the 
reservoir has been taken into account. We now apply (11-11) to some 
more specialized processes. 


Processes at constant entropy and volume 


If dS = 0 and dV = 0, we see from (11-11) that

$$dU \le 0 \tag{11-12}$$


Thus the energy can only decrease in this kind of process, and the equi- 
librium state can be characterized by 


U=minimum, S=const., V = const. (11-13) 


In this sense, we see that the energy U behaves somewhat like mechanical 
potential energy. Although this result (11-13) is of interest as an alterna- 
tive way of stating (11-8), it is not of as practical importance as the results 
we shall obtain next.


Processes at constant temperature


If T = const., we can use (11-4b) and (8-6) to write (11-11) as

$$0 \ge d(U - TS) + p\,dV = dF + p\,dV = dF - dW \tag{11-14}$$

If we rewrite this inequality in the form

$$-dW \le -dF \tag{11-15}$$


we see that the decrease in the free energy is the maximum external work 
—dW which can be done by the system during an isothermal process; 
this is the historical basis for the use of the name “free energy’’ for the 
function F. 


Processes at constant temperature and volume 


If dT = 0 and dV = 0, then (11-14) shows that

$$dF \le 0 \tag{11-16}$$


Since F can only decrease for these processes, equilibrium must correspond 
to the minimum value of F, or 


F = minimum, T = const., V = const. (11-17)


Processes at constant temperature and pressure 


If dT = 0 and dp = 0, we can use (11-4c) to write (11-14) as

$$0 \ge d(U - TS + pV) = dG \tag{11-18}$$


Since G can only decrease for these processes, equilibrium must corre- 
spond to the minimum value of G, or 


G = minimum, T = const., p = const. (11-19)


Hence, as has already been indicated several times, the functions we 
have defined in this chapter do have many of the properties we usually 
associate with the term “potential.” Since most laboratory experiments 
are most easily performed under constant pressure conditions (such as 
that of the atmosphere), it seems clear that the Gibbs function G is of 
greatest practical importance. However, as we shall see in our discussion 
of statistical mechanics, it is the Helmholtz function F which is the most 
easily calculated. 
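
The use of a potential to locate equilibrium can be made concrete with a small numerical experiment. The Python sketch below is not part of the original text and rests on a hypothetical setup: two ideal gases, held at the same temperature T by a reservoir, occupy the two sides of a cylinder of fixed total volume and are separated by a freely movable piston. Only the volume-dependent part of F is kept, and the value of R, the amounts of gas, and the function names are assumptions. According to (11-17), the piston should settle where F is a minimum.

```python
import math

R = 8314.0  # J/(kmol K); numerical value assumed for illustration


def helmholtz(V1, n1, n2, V_total, T):
    """Volume-dependent part of F for two ideal gases sharing V_total at T."""
    return -R * T * (n1 * math.log(V1) + n2 * math.log(V_total - V1))


def equilibrium_volume(n1, n2, V_total, T, points=9999):
    """Scan piston positions and return the V1 that minimizes F."""
    grid = [V_total * k / (points + 1) for k in range(1, points + 1)]
    return min(grid, key=lambda V1: helmholtz(V1, n1, n2, V_total, T))


n1, n2, V_total, T = 1.0, 3.0, 4.0, 300.0
V1 = equilibrium_volume(n1, n2, V_total, T)
p1 = n1 * R * T / V1                  # pressure on side 1 at the minimum
p2 = n2 * R * T / (V_total - V1)      # pressure on side 2 at the minimum
print(V1, p1, p2)
```

At the minimum of F the two pressures should be equal to within the resolution of the grid, which is the familiar condition of mechanical equilibrium. A grid scan is used here only for transparency; any one-dimensional minimizer would serve equally well.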




Exercises 
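11-1. Use (9-21) and (11-5a) to show that

$$C_p = T\left(\frac{\partial S}{\partial T}\right)_p \tag{11-20}$$

11-2. Show that

$$T\,dS = C_V\,dT + T\left(\frac{\partial p}{\partial T}\right)_V dV$$

$$= C_p\,dT - T\left(\frac{\partial V}{\partial T}\right)_p dp$$

$$= C_p\left(\frac{\partial T}{\partial V}\right)_p dV + C_V\left(\frac{\partial T}{\partial p}\right)_V dp$$

11-3. If the adiabatic compressibility is defined by $\kappa_S = -V^{-1}(\partial V/\partial p)_S$,
show that $\kappa_T/\kappa_S = C_p/C_V$.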

11-4. Show that the enthalpy can only decrease for processes at constant 
entropy and pressure, and thereby find the corresponding equilibrium condition. 


12 Real gases


Before we go on to consider more of the general aspects of thermodynamics, 
it is desirable to apply the results we have obtained so far to the discussion 
of the properties of a specific thermodynamic system—real gases. Up to 
now we have used only the ideal gas approximation, and we begin by 
considering another approximation to the equation of state of gases which 
is due to van der Waals. Although this historically important equation of 
state is not adequate to describe all real gases with complete accuracy, it is 
sufficiently accurate and simple enough to illustrate quite well many of the 
qualitative differences between real and ideal gases. 


12-1 The van der Waals equation of state 


Although the use of specific ideas associated with the molecular structure 
of gases is somewhat foreign to the macroscopic, empirical methods of 
thermodynamics, nevertheless we shall briefly recount the type of reasoning 
used by van der Waals in devising his famous equation. In principle, we 
cannot actually expect the equation of state (8-21) to hold for real gases 
because of the neglect of two factors: first, because the gas molecules 
presumably have a finite volume, the equation of state should reflect the 
fact that the actual volume available for the molecules to move around in 
should be less than the volume V of the container which appears in the
equation of state; second, since there are undoubtedly forces of some sort
between the molecules, the effective pressure within the system may be 
different from the observed pressure p which is also used in the equations. 

Evidently we have to subtract the effective volume occupied by the 
molecules from the observed volume V. We can obtain an idea of this 
correction factor by considering a collision between two of the molecules 
which for simplicity are assumed to be spherical; the contact between 
them at collision is illustrated in Fig. 12-1. We see that, as far as the 
volume is concerned, the effect is the same as a collision between a point 
molecule and one of radius 2R and therefore involves a volume 


$$\tfrac{4}{3}\pi(2R)^3 = 8\left(\tfrac{4}{3}\pi R^3\right) = 8 \times (\text{volume of one molecule})$$


Taking an average over the pair, we can conclude that, as far as collisions 
are concerned, the effective volume per molecule is four times the actual 
volume of one molecule. Therefore, if we multiply this by the total 
number of molecules and subtract the result from the total volume, the 
effective volume available for the whole gas can be written as 


$$V_{\mathrm{eff}} = V - \nu b \tag{12-1}$$


where b is a constant which we can interpret as four times the volume of a
kilomole of molecules. We shall reconsider this problem in our later 
discussion of statistical mechanics and shall find that our interpretation of 
b is substantially correct, in spite of the crude way in which we evaluated it 
above.

Fig. 12-1

Fig. 12-2


The intermolecular attractive forces which act on a molecule in the 
interior of the gas will cancel out on the average because the forces are 
equal in all directions, as illustrated in Fig. 12-2a. However, this will not 
generally be true for a molecule near the wall, as shown in Fig. 12-2b, and
the net effect will be a force tending to draw the molecule back into the
interior. Hence we can expect that the observed pressure p exerted by the
gas on the walls will be somewhat less than the effective pressure $p_{\mathrm{eff}}$ in
the interior, so that we should write $p_{\mathrm{eff}} = p + p'$. The force acting on a
given molecule is proportional to the number of interactions it has with
other molecules and hence is proportional to the number per unit volume
and thus to ν/V. The pressure correction is also proportional to the
number of molecules colliding with the wall and thus to ν/V; therefore the
total correction to the pressure, p′, is proportional to (ν/V)², and we can
write

$$p_{\mathrm{eff}} = p + \frac{\nu^2 a}{V^2} \tag{12-2}$$


where a is a proportionality constant characteristic of the gas. 

Since Boyle’s law (8-14) for ideal gases stated that pV = const., we 
would now expect that in order to apply this basic result to real gases we 
should replace it by 


$$p_{\mathrm{eff}} V_{\mathrm{eff}} = \text{const.} \tag{12-3}$$

and therefore by

$$\left(p + \frac{\nu^2 a}{V^2}\right)(V - \nu b) = \text{const.} \tag{12-4}$$

If we were now to follow the same line of reasoning we used in Sec. 8-5 to
derive the ideal gas equation of state after obtaining Boyle’s law, we
would find from (12-4) that

$$\left(p + \frac{\nu^2 a}{V^2}\right)(V - \nu b) = \nu R' T \tag{12-5}$$


where R’ is a constant which depends on the particular gas involved. 
For simplicity, we usually replace R’ by the universal gas constant R and 
approximate (12-5) by

$$\left(p + \frac{\nu^2 a}{V^2}\right)(V - \nu b) = \nu R T \tag{12-6}$$


which is the van der Waals equation of state. 
It is sometimes more convenient to write (12-6) in the form 


$$p = \frac{\nu R T}{V - \nu b} - \frac{\nu^2 a}{V^2} \tag{12-7}$$

and we see from this form that, as V → ∞, p → νRT/V, showing that van
der Waals’ equation reduces to the ideal gas equation in this limit. [We
can equally well obtain this result by setting a = b = 0 in (12-6) or (12-7).]


12-2 Some properties of a van der Waals gas 


Now that we have the equation of state, we can go on to consider some 
of the ways in which a real gas differs from an ideal gas by using the van 
der Waals gas as a model. We begin by finding the coefficient of thermal 
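expansion, α.

If we calculate dp from (12-7) and set it equal to zero, we find that

$$0 = \frac{\nu R\,dT}{V - \nu b} - \frac{\nu R T\,dV}{(V - \nu b)^2} + \frac{2\nu^2 a\,dV}{V^3}$$

and, since $dV = (\partial V/\partial T)_p\,dT$ in this case, we obtain

$$\alpha = \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_p = \frac{V - \nu b}{VT - [2\nu a(V - \nu b)^2/RV^2]} \tag{12-8}$$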


which becomes $\alpha_i = 1/T$ when a = b = 0 and is exactly the value for an
ideal gas given by (8-16). Therefore the deviation of α from the ideal gas
value is given by

$$\alpha - \alpha_i = \alpha - \frac{1}{T} = \frac{[2\nu a(V - \nu b)^2/RTV^2] - \nu b}{VT - [2\nu a(V - \nu b)^2/RV^2]} \tag{12-9}$$

If we assume that a and b are small enough that only their first powers
need be included, we can approximate (12-9) by

$$\alpha - \alpha_i \simeq \frac{(2\nu a/RT) - \nu b}{VT - (2\nu a/R)} \simeq \frac{(2a/RT) - b}{T(V/\nu)} \tag{12-10}$$

where, in effect, we have assumed that $V/\nu \gg b$ and $RT(V/\nu) \gg 2a$.

It is found experimentally that 2a/RT > b for most gases in the ordinary
temperature range so that $\alpha > \alpha_i$; the only exceptions are hydrogen and
the rare gases, for which the attractive forces between the molecules are so
small that 2a/RT < b.
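
The expression (12-8) and the first-order estimate (12-10) can be compared numerically. The Python sketch below is not part of the original text: it evaluates α from (12-8), checks it against a finite-difference value obtained directly from the equation of state (12-7) by means of the identity $(\partial V/\partial T)_p = -(\partial p/\partial T)_V/(\partial p/\partial V)_T$, and prints the deviation from 1/T next to the estimate (12-10). The constants a and b are arbitrary illustrative values, not data for any particular gas, and the value of R and the function names are likewise assumptions.

```python
R = 8314.0  # J/(kmol K); numerical value assumed for illustration
A = 3.6e5   # van der Waals a in J m^3/kmol^2 (illustrative value)
B = 0.043   # van der Waals b in m^3/kmol (illustrative value)


def p_vdw(nu, T, V, a=A, b=B):
    """van der Waals pressure, equation (12-7)."""
    return nu * R * T / (V - nu * b) - nu**2 * a / V**2


def alpha_formula(nu, T, V, a=A, b=B):
    """Coefficient of thermal expansion from (12-8)."""
    return (V - nu * b) / (V * T - 2 * nu * a * (V - nu * b) ** 2 / (R * V**2))


def alpha_numeric(nu, T, V, rel_step=1e-4):
    """alpha = -(dp/dT)_V / [V (dp/dV)_T], with central differences."""
    h_T, h_V = rel_step * T, rel_step * V
    dp_dT = (p_vdw(nu, T + h_T, V) - p_vdw(nu, T - h_T, V)) / (2 * h_T)
    dp_dV = (p_vdw(nu, T, V + h_V) - p_vdw(nu, T, V - h_V)) / (2 * h_V)
    return -dp_dT / (V * dp_dV)


nu, V = 1.0, 20.0
for T in (300.0, 4000.0):
    deviation = alpha_formula(nu, T, V) - 1.0 / T
    first_order = (2 * A / (R * T) - B) / (T * V / nu)   # right side of (12-10)
    print(T, deviation, first_order, alpha_numeric(nu, T, V))
```

For these illustrative constants the deviation is positive at the lower temperature, where 2a/RT exceeds b, and negative at the higher one, in agreement with the sign predicted by (12-10).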

We see from (12-10) that, for any van der Waals gas, there will be a
temperature

$$T_i = \frac{2a}{Rb} \tag{12-11}$$

called the inversion temperature for which $\alpha = \alpha_i$ so that the gas acts like
an ideal gas, at least as far as its thermal expansion is concerned. If
$T > T_i$, then $\alpha < \alpha_i$, and, if $T < T_i$, then $\alpha > \alpha_i$.

A convenient graphical way of representing some of the features of a 
van der Waals gas is to plot the isotherms on the pV plane as shown in 
Fig. 12-3; these curves are obtained by setting T = const. in (12-7). We
note first of all that, as V → νb, p → ∞ so that the dashed line V = νb is
one of the asymptotes and therefore V < νb has no meaning for this
equation of state. For large values of T, the curves are similar to the
rectangular hyperbolas pV = const., which are the isotherms of an ideal
gas, but as T is made smaller the curves become quite different and
eventually develop a maximum and a minimum. In this region, we see
from the figure that an isobar p′ = const. intersects the isotherm at three
different values of the volume; this can also be seen from (12-6) which is a
cubic equation in V for constant p and T and therefore has three roots,
of which either one or all three are real. The isotherm which is
the transition curve between these two types occurs for a temperature $T_c$
called the critical temperature. The three points of intersection with an
isobar coalesce into only one, and then the critical isotherm has a point of
inflection and zero slope at this point, known as the critical point; the
coordinates of this point on the pV plane are $p_c$ and $V_c$ and are called the
critical pressure and critical volume, respectively.

Fig. 12-3

By using the facts just given, we can find the coordinates of the critical 
point. First, van der Waals’ equation (12-7) must be satisfied so that 


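$$p_c = \frac{\nu R T_c}{V_c - \nu b} - \frac{\nu^2 a}{V_c^2} \tag{12-12}$$

Since the curve has zero slope at the critical point, we must have

$$\left(\frac{\partial p}{\partial V}\right)_T\bigg|_{\mathrm{crit}} = -\frac{\nu R T_c}{(V_c - \nu b)^2} + \frac{2\nu^2 a}{V_c^3} = 0 \tag{12-13}$$

The critical point is also a point of inflection, and we must have

$$\left(\frac{\partial^2 p}{\partial V^2}\right)_T\bigg|_{\mathrm{crit}} = \frac{2\nu R T_c}{(V_c - \nu b)^3} - \frac{6\nu^2 a}{V_c^4} = 0 \tag{12-14}$$

By eliminating $T_c$ from (12-13) and (12-14), we find that

$$V_c = 3\nu b \tag{12-15}$$

Substituting $V_c$ into (12-13), we find

$$T_c = \frac{8a}{27bR} \tag{12-16}$$

and, finally, we find from (12-12) that

$$p_c = \frac{a}{27b^2} \tag{12-17}$$

The last three results can be combined to give

$$\frac{p_c V_c}{\nu R T_c} = \frac{3}{8} \tag{12-18}$$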


in contrast to the ideal gas value pV/νRT = 1.

One can also solve for the constants characterizing the gas in terms of the
measurable critical coordinates; the results are that

$$b = \frac{1}{3}\left(\frac{V_c}{\nu}\right), \qquad a = 3p_c\left(\frac{V_c}{\nu}\right)^2, \qquad R = \frac{8p_c V_c}{3\nu T_c} \tag{12-19}$$

If these are substituted in (12-6), the van der Waals equation becomes

$$\left(p + \frac{3p_c V_c^2}{V^2}\right)\left(V - \frac{V_c}{3}\right) = \frac{8p_c V_c T}{3T_c} \tag{12-20}$$

If we now define the reduced variables by

$$\pi = \frac{p}{p_c}, \qquad \omega = \frac{V}{V_c}, \qquad \tau = \frac{T}{T_c} \tag{12-21}$$

then (12-20) can also be written

$$\left(\pi + \frac{3}{\omega^2}\right)\left(\omega - \frac{1}{3}\right) = \frac{8}{3}\tau \tag{12-22}$$

The last equation has no undetermined constants in it and thus says that, 
if the (p, V, T) values for all gases are expressed in terms of the reduced 
variables (12-21) as calculated from the critical coordinates for the given 
gas, the same equation of state (12-22) should hold for all real gases. In 
other words, van der Waals’ equation implies that the equation of state for 
any gas should be an equation involving only three constants. The 
existence of such an equation of state, which was first indicated by van der 
Waals’ equation, is known as the theorem of corresponding states. Accord- 
ingly, one can say that all gases which are in states described by the same 
values of π, ω, and τ are in corresponding states regardless of how different
their actual states as given by p, V, and T might be. It seems clear that 
such a theorem can be only approximately true and cannot hold for all 
cases, but it nevertheless proves to be of practical importance, not only in 
the study of gases but also for any general class of system which we can 
describe by reduced variables, such as the magnetic systems we shall 
consider later. 
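The theorem of corresponding states can be illustrated with a short numerical sketch. It assumes the van der Waals form $(p + \nu^2 a/V^2)(V - \nu b) = \nu RT$ and uses two hypothetical gases whose constants are invented purely for illustration; the function names are likewise assumptions.

```python
R = 8314.46  # J/(kmol K)


def vdw_pressure(V, T, a, b, nu=1.0):
    """Pressure of a van der Waals gas."""
    return nu * R * T / (V - nu * b) - nu**2 * a / V**2


def critical_point(a, b, nu=1.0):
    """Return (p_c, V_c, T_c) for the given constants."""
    return a / (27.0 * b**2), 3.0 * nu * b, 8.0 * a / (27.0 * b * R)


def reduced_pressure(omega, tau):
    """Solve the reduced equation (12-22) for pi."""
    return (8.0 * tau / 3.0) / (omega - 1.0 / 3.0) - 3.0 / omega**2


def reduced_pressure_of_gas(a, b, omega, tau):
    """Evaluate p/p_c for a specific gas at the state (omega V_c, tau T_c)."""
    p_c, V_c, T_c = critical_point(a, b)
    return vdw_pressure(omega * V_c, tau * T_c, a, b) / p_c


# Two hypothetical gases with quite different constants (a, b).
gas_A = (1.37e5, 0.0387)
gas_B = (3.6e5, 0.0427)

omega, tau = 2.0, 1.2
pi_A = reduced_pressure_of_gas(*gas_A, omega, tau)
pi_B = reduced_pressure_of_gas(*gas_B, omega, tau)
pi_universal = reduced_pressure(omega, tau)
print(pi_A, pi_B, pi_universal)
```

Both gases give the same reduced pressure as (12-22) at the same $\omega$ and $\tau$, although their actual pressures, volumes, and temperatures differ.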


12-3 State variables of a van der Waals gas 


We would expect the energy of a real gas to depend on the volume, 
because there are forces between the molecules and work must be done 
against these forces whenever the gas expands. We can calculate this
effect rigorously for the van der Waals gas since we have its equation of
state; the result is
$$\left(\frac{\partial U}{\partial V}\right)_T = T\left(\frac{\partial p}{\partial T}\right)_V - p = \frac{\nu^2 a}{V^2} \tag{12-23}$$

This result agrees with the basic ideas of our derivation of the equation of
state, since we saw in (12-2) that the pressure at the wall which is ascribed
to intermolecular forces was $-\nu^2 a/V^2$. The work done on the gas in a
small isothermal expansion would then be $dW = (\nu^2 a/V^2)\,dV$, so that the
energy change would be $dU = dW = (\nu^2 a/V^2)\,dV$, which leads at once
to (12-23).

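We can use (12-23) to obtain an interesting property of the heat capacity
$C_V$. From (9-6) and (12-23), we see that
$$\left(\frac{\partial C_V}{\partial V}\right)_T = \frac{\partial}{\partial V}\left(\frac{\partial U}{\partial T}\right)_V = \frac{\partial}{\partial T}\left(\frac{\partial U}{\partial V}\right)_T = \left[\frac{\partial}{\partial T}\left(\frac{\nu^2 a}{V^2}\right)\right]_V = 0 \tag{12-24}$$
Therefore $C_V$ is independent of the volume, and $C_V = C_V(T)$ just as for an
ideal gas. We can now write an explicit expression for the energy by
using (9-3), (9-6), and (12-23) to obtain
$$dU = C_V(T)\,dT + \frac{\nu^2 a}{V^2}\,dV \tag{12-25}$$
so that
$$U = U(T, V) = \int^{T} C_V(T')\,dT' - \frac{\nu^2 a}{V} + U_0 \tag{12-26}$$
which, when $a = 0$, reduces to (9-15) as found for the ideal gas.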
We can now evaluate the difference in the heat capacities by using (9-9),
(12-8), and (12-23) with the result that
$$C_p - C_V = \frac{\nu R}{1 - [2\nu a(V - \nu b)^2/RTV^3]} \tag{12-27}$$
which becomes approximately
$$C_p - C_V \simeq \nu R\left[1 + \frac{2\nu a(V - \nu b)^2}{RTV^3}\right] \simeq \nu R\left(1 + \frac{2\nu a}{RTV}\right) > \nu R \tag{12-28}$$
showing that for a real gas we can expect the difference in the heat capacities
to be greater than that found for an ideal gas.
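To see how large the correction is in practice, one can compare (12-27) with its approximation (12-28). This is a minimal sketch; the constants are assumed, roughly nitrogen-like values, and the state (300 K, about 1 atm) is chosen only for illustration.

```python
R = 8314.46  # J/(kmol K)


def cp_minus_cv_exact(V, T, a, b, nu=1.0):
    """Heat capacity difference from (12-27)."""
    x = 2.0 * nu * a * (V - nu * b)**2 / (R * T * V**3)
    return nu * R / (1.0 - x)


def cp_minus_cv_approx(V, T, a, nu=1.0):
    """Last form of the approximation (12-28)."""
    return nu * R * (1.0 + 2.0 * nu * a / (R * T * V))


# Assumed, roughly nitrogen-like constants per kilomole.
a, b = 1.37e5, 0.0387
T = 300.0   # K
V = 24.9    # m^3 for one kilomole, roughly 1 atm at 300 K

exact = cp_minus_cv_exact(V, T, a, b)
approx = cp_minus_cv_approx(V, T, a)
print(f"exact / (nu R)  = {exact / R:.6f}")
print(f"approx / (nu R) = {approx / R:.6f}")
```

At this low density both expressions exceed $\nu R$ by well under one percent, which is consistent with the gas being nearly ideal there.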




We can now calculate the entropy by using (10-28), (12-25), and (12-6)
to find that
$$dS = \frac{C_V(T)\,dT}{T} + \frac{\nu R\,dV}{V - \nu b} \tag{12-29}$$
and, on integrating (12-29), we obtain
$$S - S_0 = \int_{T_0}^{T} \frac{C_V(T')\,dT'}{T'} + \nu R \ln\frac{V - \nu b}{V_0 - \nu b} \tag{12-30}$$


which reduces to the ideal gas value (9-42) when b = 0. 

It is of interest to note that the constant a which describes the inter- 
molecular forces appears in the energy expression (12-26) but not in that 
for the entropy (12-30), while the opposite occurs for the constant b 
related to the molecular volumes. The reasons for these facts will become 
clearer when we consider gases by the methods of statistical mechanics. 


12-4 Joule-Thomson effect


Now that we have more powerful methods than were previously avail- 
able to us, we want to go back and discuss in more detail the Joule-Thomson 
effect which is a name used to describe the results of the porous plug 
experiment which we first discussed in Sec. 9-6; this process is also called a 
throttling process. 

We have seen in (9-37) that this process is isenthalpic, that is, it can be 
characterized as one of constant enthalpy; this was shown to hold in 
general, regardless of the type of fluid that may be used. It will be well to 
review the way in which the experiment is actually performed. The
initial values of the pressure $p_i$ and temperature $\theta_i$ (measured on some
convenient empirical scale) are arbitrarily chosen; then, as indicated in
Fig. 9-4, the final pressure $p_f$ is also chosen arbitrarily, the experiment is
performed, and the final temperature $\theta_f$ is measured. The net result is
that for each pair of initial values $(p_i, \theta_i)$, we can obtain a set of pairs of
values $(p_f, \theta_f)$ all of which correspond to the same enthalpy. If these values
are plotted on a $\theta p$ diagram, the result is a curve like that shown in Fig.
12-4. Since all the points correspond to the same enthalpy of the gas, the
curve is called isenthalpic. It does not, however, represent the actual
throttling process which is irreversible and therefore cannot be represented
as a sequence of equilibrium states.

[Fig. 12-5: isenthalpic curves, with the inversion curve separating the cooling and heating regions]

If one now chooses a new initial set $(p_i, \theta_i)$, the whole process can be
repeated and another isenthalpic curve corresponding to a different value
of the enthalpy can be obtained. The set of isenthalpic curves obtained in
this way has the general appearance shown in Fig. 12-5. Since the pressure
is reduced in this process, the designations “cooling” and “heating”
applied to the regions shown are quite appropriate. The numerical value
of the slope of an isenthalpic curve is given by
$$\mu' = \left(\frac{\partial \theta}{\partial p}\right)_H = \text{Joule-Thomson coefficient} \tag{12-31}$$
The locus of all points for which $\mu' = 0$ is called the inversion curve and is
shown as the dashed line in Fig. 12-5. When the absolute temperature $T$ is
used, the Joule-Thomson coefficient is written as $\mu$:
$$\mu = \left(\frac{\partial T}{\partial p}\right)_H \tag{12-32}$$
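We now want to relate $\mu$ to other properties of the system. If we write
$S = S(T, p)$ so that
$$dS = \left(\frac{\partial S}{\partial T}\right)_p dT + \left(\frac{\partial S}{\partial p}\right)_T dp = \frac{C_p\,dT}{T} - \left(\frac{\partial V}{\partial T}\right)_p dp \tag{12-33}$$
because of (11-20) and (11-7d), and then substitute (12-33) into (11-5a), we
find that
$$dH = C_p\,dT + \left[V - T\left(\frac{\partial V}{\partial T}\right)_p\right] dp \tag{12-34}$$
which now involves the variables $T$ and $p$ needed to evaluate (12-32). In an
isenthalpic process,
$$dH = 0, \qquad dT = \left(\frac{\partial T}{\partial p}\right)_H dp \tag{12-35}$$
if we treat $T$ as a function of $p$ and $H$. If (12-35) is substituted into (12-34)
and (12-32) is used, we find that
$$\mu = \left(\frac{\partial T}{\partial p}\right)_H = \frac{1}{C_p}\left[T\left(\frac{\partial V}{\partial T}\right)_p - V\right] \tag{12-36}$$
so that we have succeeded in relating the Joule-Thomson coefficient to the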


equation of state. We can also write (12-36) in terms of the coefficient of 
thermal expansion by using (8-2); the result is that 


$$\mu = \frac{VT}{C_p}\left(\alpha - \frac{1}{T}\right) = \frac{VT(\alpha - \alpha_0)}{C_p} \tag{12-37}$$


if (8-16) is also used. We see at once that $\mu = 0$ for an ideal gas; this
means that the temperature of an ideal gas does not change during a
porous plug experiment. We have already discussed this result in Sec. 9-6.


Example. van der Waals Gas. If we substitute (12-10) and (12-11) into 
(12-37), we find that 


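$$\mu \simeq \frac{\nu}{C_p}\left(\frac{2a}{RT} - b\right) = \frac{2\nu a}{RC_p}\left(\frac{1}{T} - \frac{1}{T_i}\right), \qquad T_i = \frac{2a}{Rb} \tag{12-38}$$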


in terms of the inversion temperature $T_i$. Therefore, if $T < T_i$, $\mu > 0$
and the gas will cool on expansion, whereas it will warm up if its initial
temperature is greater than the inversion temperature. In the case of
air, there is considerable cooling even at 0°C, and this effect forms the
basis of the Linde process of liquefying air by repeated expansions.
Hydrogen, on the other hand, is above its inversion temperature even at
0°C, so that it will warm up when expanded; in order to liquefy hydrogen
by repeated expansion, it is first necessary to cool it below its inversion
temperature of about −80°C.
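The qualitative contrast between air and hydrogen can be reproduced from the low-density van der Waals result $\mu \simeq (\nu/C_p)(2a/RT - b)$, whose sign changes at $T_i = 2a/Rb$. The sketch below is only a rough estimate: it works per mole rather than per kilomole (the inversion temperature is the same either way), it uses nitrogen as a stand-in for air, and the van der Waals constants and the molar heat capacity are assumed handbook-style values, not numbers from the text.

```python
R = 8.314462618  # J/(mol K)


def inversion_temperature(a, b):
    """Low-density van der Waals estimate T_i = 2a / (R b)."""
    return 2.0 * a / (R * b)


def jt_coefficient(T, a, b, C_p):
    """Low-density estimate of mu in K/Pa for one mole with molar heat capacity C_p."""
    return (2.0 * a / (R * T) - b) / C_p


# Assumed constants: a in Pa m^6 / mol^2, b in m^3 / mol.
GASES = {
    "nitrogen": (0.1370, 3.87e-5),
    "hydrogen": (0.02453, 2.65e-5),
}
C_P = 29.1      # J/(mol K), assumed value of about 7R/2 for a diatomic gas
T_ICE = 273.15  # K

for name, (a, b) in GASES.items():
    T_i = inversion_temperature(a, b)
    mu = jt_coefficient(T_ICE, a, b, C_P)
    effect = "cools" if mu > 0 else "warms"
    print(f"{name}: T_i = {T_i:.0f} K, mu(0 C) = {mu:.2e} K/Pa, {effect} on expansion")
```

With these constants nitrogen lies below its estimated inversion temperature at 0°C and hydrogen lies above it, in agreement with the discussion; the numerical value obtained for hydrogen is only roughly comparable with the measured figure, as one would expect from so simple an equation of state.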


12-5 Measurement of the absolute temperature of the ice point 


Another important practical result of the last considerations is the
experimental determination of the quantitative relation between the 
absolute temperature and an empirical temperature. The initial basis we 
used for defining the absolute temperature was the equation of state of an 
ideal gas as given by (8-21). We also defined the absolute temperature T in 
terms of the heats absorbed and rejected by a reversible Carnot engine as 
given in (10-15). We then showed in (10-24) that the ideal gas temperature 
scale and the absolute temperature scale are identical. 

However, we do not actually have ideal gases available for our use;
consequently, what we really need to know is how the temperature $\theta$ as
measured by a real gas thermometer can be related to the absolute temperature
$T$. Since the Celsius scale and the absolute scale are chosen so as to
have the same size unit, we see from the connection between them as
given by (8-17) that the essential number for which an accurate measurement
is needed is $T_0$ (the ice point), that is, the absolute temperature
corresponding to 0°C. Although (10-15) could be used in principle, it
would not be very satisfactory because calorimetric measurements are not
sufficiently accurate. Since (10-14) and (10-15) form the basis for introducing
$T$ into the second law, any relation involving $T$ which is obtained from
the second law can equally well be used for this purpose, and it has been
found that the Joule-Thomson effect is particularly suitable.

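Our starting point will be (12-36), but, since all quantities used are
measured in terms of the empirical scale $\theta$, we must transform (12-36) to
take this into account. From (10-14), (12-31), and (9-21), we find that
$$\mu = \left(\frac{\partial T}{\partial p}\right)_H = \frac{dT}{d\theta}\left(\frac{\partial \theta}{\partial p}\right)_H = \mu' \frac{dT}{d\theta} \tag{12-39}$$
$$C_p = \left(\frac{\partial H}{\partial T}\right)_p = \left(\frac{\partial H}{\partial \theta}\right)_p \frac{d\theta}{dT} = C_p' \frac{d\theta}{dT} \tag{12-40}$$
where $C_p'$ is the experimentally determined heat capacity. We also have
$$\left(\frac{\partial V}{\partial T}\right)_p = \left(\frac{\partial V}{\partial \theta}\right)_p \frac{d\theta}{dT}$$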
and, when these results are substituted into (12-36), we find that
$$\mu' \frac{dT}{d\theta} = \frac{1}{C_p'}\frac{dT}{d\theta}\left[T\left(\frac{\partial V}{\partial \theta}\right)_p \frac{d\theta}{dT} - V\right]$$
and therefore
$$\frac{dT}{T} = \frac{(\partial V/\partial \theta)_p\,d\theta}{V + \mu' C_p'} \tag{12-41}$$
which now has only measurable quantities on the right side. If we limit
ourselves to values of $\mu'$ and $V$ all at one definite pressure, $(\partial V/\partial \theta)_p\,d\theta = dV$
and (12-41) can be written
$$\frac{dT}{T} = \frac{dV}{V + \mu' C_p'} \tag{12-42}$$


If we integrate between the ice point (0°C) and the steam point (100°C)
whose absolute temperatures are $T_0$ and $T_s = T_0 + 100$, respectively, we
find that
$$\ln\frac{T_s}{T_0} = \ln\left(1 + \frac{100}{T_0}\right) = \int_{V_0}^{V_s} \frac{dV}{V + \mu' C_p'} \tag{12-43}$$
where $V_0$ and $V_s$ are the measured volumes of the gas at the two temperatures.
For an ideal gas, $\mu' = 0$ and (12-43) leads directly to $T_s/T_0 = V_s/V_0$,
which is seen to be Charles’ law (8-15) when we recall that all quantities
involved correspond to the same pressure.

This result (12-43) expresses $T_0$ completely in terms of quantities
measured by the use of real gases, and thus our problem is solved in
principle. This is the way in which the accurate value $T_0 = 273.15$ degrees
given by (8-18) was actually obtained.
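The way (12-43) yields $T_0$ can be sketched numerically. The function below integrates the right side by the trapezoidal rule and solves $\ln(1 + 100/T_0) = I$ for $T_0$. The function name, the callable `mu_cp` standing for the measured product $\mu' C_p'$ as a function of volume, and the sample values are all assumptions made for illustration; no experimental data are implied.

```python
import math


def ice_point(V_0, V_s, mu_cp=lambda V: 0.0, steps=10000):
    """Solve (12-43) for T_0.

    V_0, V_s : gas volumes at the ice and steam points (same pressure)
    mu_cp    : hypothetical callable returning mu' * C_p' at volume V
    """
    h = (V_s - V_0) / steps

    def integrand(V):
        return 1.0 / (V + mu_cp(V))

    # Composite trapezoidal rule for the integral from V_0 to V_s.
    total = 0.5 * (integrand(V_0) + integrand(V_s))
    for k in range(1, steps):
        total += integrand(V_0 + k * h)
    integral = total * h

    # ln(1 + 100/T_0) = integral  =>  T_0 = 100 / (exp(integral) - 1)
    return 100.0 / math.expm1(integral)


# Ideal gas check: mu' = 0 and V_s / V_0 = 373.15 / 273.15.
T0_ideal = ice_point(1.0, 373.15 / 273.15)

# A gas that cools on expansion (mu' C_p' > 0, an arbitrary constant here)
# needs a larger T_0 to account for the same measured volumes.
T0_real = ice_point(1.0, 373.15 / 273.15, mu_cp=lambda V: 0.01)
print(T0_ideal, T0_real)
```

In the ideal case the routine simply recovers Charles’ law; the second call shows the direction in which a Joule-Thomson correction shifts the inferred ice point.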


Exercises 


12-1. Assume that $C_V$ for a van der Waals gas is constant and show that the
equation of a reversible adiabatic process is $T(V - \nu b)^{\nu R/C_V} = \text{const.}$, while
the temperature change in a free expansion is given by
$T_f - T_i = \nu^2 a(V_i - V_f)/C_V V_i V_f$.

12-2. Show that $(\partial C_p/\partial p)_T = -T(\partial^2 V/\partial T^2)_p$ and apply to a van der Waals
gas.

12-3. Find the critical variables $p_c$, $T_c$, and $V_c$ for a gas which satisfies the
Dieterici equation of state:
$$p(V - \nu b) = \nu RT e^{-\nu a/RTV}$$

12-4. Calculate the Joule-Thomson coefficient $\mu$ for a gas obeying the Dieterici
equation of state.
13 The third law of thermodynamics


The third law is different in character from the other three laws we have 
discussed because it does not postulate the existence of a particular state 
variable but, instead, makes a definite statement about the numerical 
values of functions we have already defined. We recall that in (10-27) the 
entropy was defined only in terms of its differential $dS$, so that $S$ itself is
defined only up to a constant $S_0$. As long as our applications of thermodynamics
require only the use of entropy differences (as in evaluating
derivatives), the unknown value of $S_0$ does not present a problem; however,
the potentials $F$ and $G$ involve a term $TS$ in their definitions, so that there
is also an undefined quantity $TS_0$ involved in them, and there are situations
where a knowledge of their absolute values is necessary. The third law 
enables us to make a definite statement about the absolute value of S; 
it was first proposed by Nernst, although we shall postulate it in a form 
which was given later by Planck. 


THIRD LAW OF THERMODYNAMICS. The entropy of any system 
vanishes as the absolute temperature approaches zero; that is, 


$$\lim_{T \to 0} S = 0 \tag{13-1}$$

As Nernst initially formulated his statement, it related only to the
change $\Delta S = S_f - S_i$ of the entropy of any system in an isothermal
process connecting the initial and final states $i$ and $f$, and he said that
$$\lim_{T \to 0} \Delta S = 0 \tag{13-2}$$

This less general statement says only that the entropy of any system
approaches a definite limiting value $S_0$ which is completely independent of
the nature of the state at $T = 0$. We can then say that, since this value $S_0$
is universal and unobservable, then for all practical purposes it is zero and
we can simply set $S_0 = 0$ to obtain (13-1). We shall also see that we can
obtain (13-1) by a simple extension of the reasoning leading to (13-2). In
addition, the zero value for $S$ as $T \to 0$ is the natural value indicated by
statistical mechanics. We shall first indicate how Nernst was led to his
result, and then we shall consider some of the experimental consequences
of (13-1).
13-1 Derivation of Nernst’s result


Nernst was led to postulate (13-2) by his attempts to account for the 
principle of Thomsen and Berthelot. This principle is an approximate 
empirical rule which is useful in chemistry for predicting the final equilibrium
state of a chemical reaction. Basically, the rule says that, if we
consider processes at constant $T$ and $p$, the final state will be the one which
corresponds to the release of the most heat. We can state this differently if
we recall our result (9-18) in which we showed that the heat added to a
system at constant pressure equals the increase in the enthalpy; therefore,
in this case we have
$$\text{Heat given off} = H_i - H_f \tag{13-3}$$
According to the principle of Thomsen and Berthelot, the final equilibrium
corresponds to $(H_i - H_f)_{\max}$ or $(-H_f)_{\max}$ since the initial state is fixed or,
finally, to
$$H_f = \text{minimum} \tag{13-4}$$


In other words, the principle is equivalent to the rule that equilibrium at 
constant T and p corresponds to a minimum of the enthalpy. On the other 
hand, we know from (11-19) that the exact condition requires that the 
Gibbs function G be a minimum; the basic problem is therefore to deter- 
mine how these two statements could be even approximately equivalent. 

Experience also showed that the rule (13-4) is useful only if the tempera- 
ture is not too high, and, in fact, the lower the temperature, the better the 
rule. By extrapolating this result, we are led to make the assumption that 
the rule (13-4) is exact at $T = 0$. Since $G = H - TS$, the changes occurring
in the process are related by
$$\Delta G = \Delta H - T\,\Delta S \tag{13-5}$$
which shows that if we assume
$$\lim_{T \to 0} \Delta H = \lim_{T \to 0} \Delta G \tag{13-6}$$
we can conclude that $\Delta S$ is bounded at $T = 0$. If we try to determine the
limiting value of $\Delta S$ by writing (13-5) as
$$\Delta S = \frac{\Delta H - \Delta G}{T} \tag{13-7}$$
and letting $T \to 0$, we find that the expression approaches 0/0 and hence is
indeterminate because of (13-6). If we try to evaluate this indeterminate
expression by the usual rule of separately differentiating the numerator and
denominator and then going to the limit, we find from (13-7) that
$$\lim_{T \to 0} \Delta S = \left(\frac{d\,\Delta H}{dT}\right)_{T=0} - \left(\frac{d\,\Delta G}{dT}\right)_{T=0} \tag{13-8}$$
However, the approximate equality $\Delta H \simeq \Delta G$ holds not only at $T = 0$
according to (13-6) but even when $T$ is quite a bit different from zero;
in other words, the two curves must almost coincide over a fairly large
range of $T$; this is possible only if their slopes are equal as indicated in
Fig. 13-1. Thus we have
$$\lim_{T \to 0}\left(\frac{d\,\Delta G}{dT}\right) = \lim_{T \to 0}\left(\frac{d\,\Delta H}{dT}\right) \tag{13-9}$$
so that we find from (13-8) that $\Delta S \to 0$ as $T \to 0$, exactly as given by
(13-2).

Planck extended Nernst’s postulates (13-6) and (13-9) to hold for the 
functions themselves as well as their differences; thus we can also write 


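$$\lim_{T \to 0} G = \lim_{T \to 0} H \tag{13-10}$$
$$\lim_{T \to 0}\left(\frac{\partial G}{\partial T}\right)_p = \lim_{T \to 0}\left(\frac{\partial H}{\partial T}\right)_p \tag{13-11}$$
If we use (11-4a), (11-4c), and (11-6c), we can write
$$G = H + T\left(\frac{\partial G}{\partial T}\right)_p \tag{13-12}$$
so that
$$\left(\frac{\partial G}{\partial T}\right)_p = \frac{G - H}{T} \tag{13-13}$$
If we look at the limit of this equation as $T \to 0$, we again get the indeterminate
form 0/0; if we then evaluate the limit as we did to obtain (13-8),
we find from (13-13) and (13-11) that
$$\lim_{T \to 0}\left(\frac{\partial G}{\partial T}\right)_p = \lim_{T \to 0}\left(\frac{\partial G}{\partial T}\right)_p - \lim_{T \to 0}\left(\frac{\partial H}{\partial T}\right)_p = 0 \tag{13-14}$$
and therefore, since $S = -(\partial G/\partial T)_p$ by (11-6c), we find that $S \to 0$ as
$T \to 0$, showing that the Nernst-Planck postulates lead to the form of the
third law we originally stated in (13-1). We also see from (13-14) and (13-9)
that the initial slope of the $\Delta G$ (and $\Delta H$) curve should be zero as shown in
Fig. 13-1.

[Fig. 13-1: $\Delta G$ and $\Delta H$ plotted against $T$]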

In connection with the third law, it should be emphasized that thermo- 
dynamics deals only with equilibrium situations and tells us nothing about 
the time required for equilibrium to be attained. Since atomic and 
molecular motions slow down considerably as the temperature is decreased 
and internal processes therefore generally proceed much more slowly, it 
may well take an extremely long time for the system to get into equilibrium 
in the state corresponding to zero entropy. Hence, for all practical 
purposes, this state may never be actually attained because one cannot 
wait that long and the net effect would be that one would be dealing with a 
real system at $T \approx 0$ whose entropy is actually different from zero since
the system is only in a quasi-equilibrium state. 


13-2 Some consequences of the third law 


In this section we want to consider some of the ways in which the third 
law can be checked experimentally because of the specific predictions it 
makes about certain properties of a system. 

We begin with the reciprocal relation (11-7d):
$$\left(\frac{\partial S}{\partial p}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_p \tag{13-15}$$
Since $S$ approaches a limit which is independent of $p$ as $T \to 0$, then
$(\partial S/\partial p)_T \to 0$ as $T \to 0$, so that
$$\lim_{T \to 0}\left(\frac{\partial V}{\partial T}\right)_p = 0 \tag{13-16}$$
and therefore
$$\lim_{T \to 0} \alpha = 0 \tag{13-17}$$
because of (8-2). In other words, the coefficient of thermal expansion of
any system vanishes at absolute zero.

Similarly, by starting with $(\partial S/\partial V)_T = (\partial p/\partial T)_V$ as given by (11-7c), we
can conclude that $(\partial p/\partial T)_V \to 0$ as $T \to 0$, and therefore that
$$\lim_{T \to 0} \beta = 0 \tag{13-18}$$
because of (8-2).

We have seen in (10-44) that $C_V = T(\partial S/\partial T)_V$ so that, if we calculate the
entropy of a given state $(T, V)$ by integrating with respect to temperature
at constant volume and using (13-1), we obtain
$$S(T, V) = \int_0^T \frac{C_V(T')\,dT'}{T'} \tag{13-19}$$
In order that we get a finite value of the entropy, the integral must converge
at the lower limit, and we must have
$$\lim_{T \to 0} C_V = 0 \tag{13-20}$$
Similarly, if we begin with (11-20), we find
$$\lim_{T \to 0} C_p = 0 \tag{13-21}$$
In the same way, we can conclude in general that the heat capacities must 
vanish at absolute zero. This important consequence of the third law led 
to a series of experiments which were done by Nernst and his students to 
measure heat capacities at low temperature. The relations (13-20) and 
(13-21) have been amply confirmed by experiment and therefore provide 
evidence for the basic correctness of (13-1). 
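The convergence argument behind (13-20) can be made concrete with a small numerical sketch. It integrates $C_V/T'$ from a lower cutoff up to a fixed temperature and then shrinks the cutoff. The two model heat capacities are assumptions chosen for illustration: one vanishing as $T^3$ (a common low-temperature form for solids, not derived in this section) and one that stays constant down to $T = 0$.

```python
import math


def entropy_above_cutoff(heat_capacity, T, T_min, steps=20000):
    """Integrate C(T')/T' from T_min to T using the substitution u = ln T'.

    With u = ln T', dT'/T' = du, so the integrand is just C(exp(u)).
    The midpoint rule is applied on a uniform grid in u.
    """
    u_lo, u_hi = math.log(T_min), math.log(T)
    h = (u_hi - u_lo) / steps
    total = 0.0
    for k in range(steps):
        u_mid = u_lo + (k + 0.5) * h
        total += heat_capacity(math.exp(u_mid))
    return total * h


def cubic(T):
    """Assumed heat capacity vanishing as T^3 (arbitrary units)."""
    return T**3


def constant(T):
    """Heat capacity that does not vanish at T = 0 (arbitrary units)."""
    return 1.0


for T_min in (1e-2, 1e-4, 1e-6):
    s_cubic = entropy_above_cutoff(cubic, 2.0, T_min)
    s_const = entropy_above_cutoff(constant, 2.0, T_min)
    print(f"T_min = {T_min:.0e}: cubic {s_cubic:.6f}, constant {s_const:.3f}")
```

As the cutoff is lowered, the cubic case settles to a finite entropy while the constant case grows without bound like $\ln(T/T_{\min})$, which is why the heat capacity must vanish for (13-19) to make sense.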

One can carry out the same type of analysis for other quantities by 
using the Maxwell relations obtained from the various possible Legendre 
transformations of the energy. The general result is that all the temperature
coefficients of the intensive and extensive parameters vanish at absolute 
zero, although we have seen only the examples (13-14), (13-17), (13-18), 
(13-20), and (13-21). 


13-3 ‘‘Unattainability’’ of absolute zero 


The third law can also be stated in a negative sense, just as the first and 
second laws could. The usual way of doing this is to say that absolute zero 
is a temperature which actually can never be attained but can only be 
approached asymptotically. One can show this to be a consequence of
Nernst’s postulate (13-2) by considering any general adiabatic process, but
we shall do it in a specific way by discussing the only feasible way of
reaching very low temperatures: the method of adiabatic demagnetization.

[Fig. 13-2: entropy as a function of temperature for the unmagnetized and magnetized states]

We shall discuss magnetic materials in detail in the next chapter, but
for our present purposes we only need the general result that the entropy
of an unmagnetized material, $S_0$, is greater than the entropy $S_H$ when it is
in the magnetized state. These entropies are shown qualitatively as
functions of temperature in Fig. 13-2 which is drawn in accordance with
(13-1). Suppose we were to start with the magnetized system at temperature
$T_i$ and demagnetize it adiabatically, for example, by simply switching
off the external field. Since $dQ = 0$, the entropy will be constant according
to (10-27) and the process will take place along the horizontal line 1 → 2
whose intersection with the curve $S_0$ determines the final temperature $T_f$,
which is seen to be less than $T_i$. We can now magnetize the system
isothermally along the path 2 → 3; the net result is that our magnetized
system has been cooled from $T_i$ to $T_f$. We can repeat the adiabatic
demagnetization along 3 → 4 and lower the temperature to $T_f'$. We see
from these considerations, however, that, no matter how many steps like
this we may make, we can never actually attain $T = 0$, although we can
get arbitrarily close to it in principle. The closeness of the approach to
$T = 0$ is seen to be determined by the steepness of the curve for $S_0$.
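The staircase argument can be imitated with a toy model. The sketch below assumes two entropy curves that are both linear in $T$ and meet at $S = 0$ when $T = 0$, as (13-1) requires; these linear forms, the coefficients, and the function name are hypothetical and serve only to show the pattern of repeated steps.

```python
def cooling_sequence(T_start, s0_coeff, sH_coeff, steps):
    """Temperatures reached by repeated adiabatic demagnetization.

    Hypothetical model: unmagnetized entropy S_0(T) = s0_coeff * T and
    magnetized entropy S_H(T) = sH_coeff * T with sH_coeff < s0_coeff,
    so that both curves pass through S = 0 at T = 0.
    """
    if not 0.0 < sH_coeff < s0_coeff:
        raise ValueError("need 0 < sH_coeff < s0_coeff")
    temps = [T_start]
    T = T_start
    for _ in range(steps):
        S = sH_coeff * T   # entropy of the magnetized state at temperature T
        T = S / s0_coeff   # constant-entropy move to the unmagnetized curve
        temps.append(T)    # isothermal remagnetization then keeps this T
    return temps


temps = cooling_sequence(T_start=1.0, s0_coeff=2.0, sH_coeff=1.0, steps=10)
print(temps)
```

Each cycle lowers the temperature by the same factor, so in this model the temperature decreases geometrically and remains positive after any finite number of steps.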


Exercises 


13-1. Is it possible for a system to be described by the ideal gas equation of 
state all the way down to T = 0? by the van der Waals equation? 

13-2. Redraw Fig. 13-2 in a form which violates (13-1), and show that then 
it would be possible to reach absolute zero in a finite number of steps. 
14 Magnetic systems


In this chapter we want to consider the thermodynamic properties of a 
system which requires variables in addition to the pressure and volume 
for the specification of its state. Although there are many possible types 
we could discuss, we shall confine ourselves to the important case of 
magnetic systems. 

The first problem is to express the energy in terms of appropriate 
variables. In a previous discussion we found the energy density associated 
with a magnetic field to be given by (I: 26-16) as $u_m = B^2/2\mu$ for a linear,
isotropic, homogeneous magnetic material of permeability $\mu$. If we change
the fields slightly, the increase in the energy density will be
$$du_m = \frac{B\,dB}{\mu} = H\,dB \tag{14-1}$$
since all the magnetic vectors are parallel in an isotropic material. We
also know from (I: 19-59) that we can write $B = \mu_0(H + M)$, where $M$ is
the magnetization, that is, the magnetic dipole moment per unit volume;
when this expression for $B$ is inserted into (14-1) we find
$$du_m = \mu_0 H\,dH + \mu_0 H\,dM \tag{14-2}$$


In all the cases we shall consider, the magnetization of the material will be 
so small that H can be taken as approximately equal to the external field. 
The first term of (14-2) will always be present even in the absence of the 
material; hence it can be dropped from consideration since we are 
primarily interested in the thermodynamic properties of the matter. 
Accordingly, we can take our expression for the increment of magnetic 
energy density as simply 


$$du_m = \mu_0 H\,dM \tag{14-3}$$
If we let $\mathcal{M}$ be the total magnetic moment of the system, where
$$\mathcal{M} = VM \tag{14-4}$$
we find from (14-3) that
$$dU_m = \mu_0 H\,d\mathcal{M} \tag{14-5}$$
If we now augment the expression for the conservation of energy (11-1) by
(14-5), we obtain
$$dU = T\,dS - p\,dV + \mu_0 H\,d\mathcal{M} \tag{14-6}$$
so that
$$T\,dS = dU + p\,dV - \mu_0 H\,d\mathcal{M} \tag{14-7}$$
Since our aim is to concentrate on the magnetic effects, and since the
volume change associated with a change in magnetization is negligible
unless we consider magnetostrictive materials, we can neglect the $p\,dV$
term in (14-7) and simply use
$$T\,dS = dU - \mu_0 H\,d\mathcal{M} \tag{14-8}$$
as our basic equation. Although our derivation has been somewhat
oversimplified, the final result is correct; we can see that it has the proper
qualitative form since (14-5) is written as the product of the intensive
variable (or “generalized force” $\mu_0 H$ acting on the system) and the differential
of the extensive variable (or “generalized displacement” $d\mathcal{M}$).
We also see that the signs in (14-8) agree with the curves of Fig. 13-2.

Now that we have (14-8), we are able to investigate the properties of 
magnetic systems in a manner similar to that previously used for fluid 
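systems. For example, if we regard both $S$ and $U$ as functions of $T$ and
$\mathcal{M}$, (14-8) can be written
$$dS = \frac{1}{T}\left(\frac{\partial U}{\partial T}\right)_{\mathcal{M}} dT + \frac{1}{T}\left[\left(\frac{\partial U}{\partial \mathcal{M}}\right)_T - \mu_0 H\right] d\mathcal{M} \tag{14-9}$$
If we now apply our condition for an exact differential (7-22) to (14-9), we
obtain
$$\frac{\partial}{\partial \mathcal{M}}\left[\frac{1}{T}\left(\frac{\partial U}{\partial T}\right)_{\mathcal{M}}\right]_T = \frac{\partial}{\partial T}\left\{\frac{1}{T}\left[\left(\frac{\partial U}{\partial \mathcal{M}}\right)_T - \mu_0 H\right]\right\}_{\mathcal{M}}$$
from which we find that
$$\left(\frac{\partial U}{\partial \mathcal{M}}\right)_T = \mu_0\left[H - T\left(\frac{\partial H}{\partial T}\right)_{\mathcal{M}}\right] \tag{14-10}$$
which is similar to (10-46) and can be evaluated to give us the dependence
of the energy on the magnetization once we know the magnetic equation of
state: $\mathcal{M} = \mathcal{M}(T, H)$ or $H = H(T, \mathcal{M})$.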

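Just as we could have defined an ideal gas as one for which the energy
was independent of the volume, that is, $(\partial U/\partial V)_T = 0$, so can we define an
ideal magnetic material as one for which
$$\left(\frac{\partial U}{\partial \mathcal{M}}\right)_T = 0 \tag{14-11}$$
so that the energy depends only on the temperature. We see from (14-10)
that this would require $T(\partial H/\partial T)_{\mathcal{M}} = H$, or
$$\left(\frac{\partial H}{\partial T}\right)_{\mathcal{M}} = \frac{H}{T} \tag{14-12}$$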
We can satisfy (14-12) by writing $H = h(\mathcal{M})T$, where $h(\mathcal{M})$ is an arbitrary
function of $\mathcal{M}$. If we solve the last expression for $\mathcal{M}$, we find (14-11) to be
equivalent to
$$\mathcal{M} = m\!\left(\frac{\mathcal{C}H}{T}\right) \tag{14-13}$$
where $m(x)$ is some function of the variable $x$ and $\mathcal{C}$ is an appropriate
dimensional constant; in other words, the magnetic moment of an ideal
magnetic material is a function only of the ratio $H/T$. The simplest form
to assume is a linear one: $m(x) = x$; (14-13) then becomes
$$\mathcal{M} = \frac{\mathcal{C}H}{T} \tag{14-14}$$
This equation of state is known as Curie’s law and is found experimentally
to be applicable to many paramagnetic materials where $\mathcal{C}$ is a constant
characteristic of the material. We shall find in a later chapter that ferromagnetic
materials do not have a simple equation of state of the form
(14-13) and therefore cannot be characterized as ideal magnetic materials.
We note too that an ideal magnetic material cannot exist as such all the
way down to absolute zero, since we can find from (14-13) that
$$\left(\frac{\partial \mathcal{M}}{\partial T}\right)_H = \frac{dm}{dx}\left(\frac{\partial x}{\partial T}\right)_H = -\frac{\mathcal{C}H}{T^2}\frac{dm}{dx}, \qquad x = \frac{\mathcal{C}H}{T} \tag{14-15}$$
which leads to $(\partial \mathcal{M}/\partial T)_H \to \infty$ as $T \to 0$ in contradiction to the third law.
Thus, although (14-13) may be found to be an adequate description for a
given material in a given temperature region, we know from the last
result that, as the temperature is lowered, we shall eventually need another
equation of state to describe the material properly.
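If we now substitute (14-10) into (14-9), we obtain
$$T\,dS = \left(\frac{\partial U}{\partial T}\right)_{\mathcal{M}} dT - \mu_0 T\left(\frac{\partial H}{\partial T}\right)_{\mathcal{M}} d\mathcal{M} \tag{14-16}$$
We can define a heat capacity at constant magnetic moment $C_{\mathcal{M}}$, and we
see from (10-27) and (14-16) that
$$C_{\mathcal{M}} = T\left(\frac{\partial S}{\partial T}\right)_{\mathcal{M}} = \left(\frac{\partial U}{\partial T}\right)_{\mathcal{M}} \tag{14-17}$$
In order to define a heat capacity at constant field $C_H$, we need to express
(14-16) as a function of $T$ and $H$. This can be done by writing the magnetic
equation of state as $\mathcal{M} = \mathcal{M}(T, H)$ so that
$$d\mathcal{M} = \left(\frac{\partial \mathcal{M}}{\partial T}\right)_H dT + \left(\frac{\partial \mathcal{M}}{\partial H}\right)_T dH \tag{14-18}$$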
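which, when substituted into (14-16), leads to
$$T\,dS = \left[\left(\frac{\partial U}{\partial T}\right)_{\mathcal{M}} - \mu_0 T\left(\frac{\partial H}{\partial T}\right)_{\mathcal{M}}\left(\frac{\partial \mathcal{M}}{\partial T}\right)_H\right] dT - \mu_0 T\left(\frac{\partial H}{\partial T}\right)_{\mathcal{M}}\left(\frac{\partial \mathcal{M}}{\partial H}\right)_T dH \tag{14-19}$$
from which we find that
$$C_H = T\left(\frac{\partial S}{\partial T}\right)_H = \left(\frac{\partial U}{\partial T}\right)_{\mathcal{M}} - \mu_0 T\left(\frac{\partial H}{\partial T}\right)_{\mathcal{M}}\left(\frac{\partial \mathcal{M}}{\partial T}\right)_H \tag{14-20}$$
and, if we use (14-17), we finally obtain
$$C_H - C_{\mathcal{M}} = -\mu_0 T\left(\frac{\partial H}{\partial T}\right)_{\mathcal{M}}\left(\frac{\partial \mathcal{M}}{\partial T}\right)_H \tag{14-21}$$
As an example, let us consider a Curie law material described by (14-14)
for which
$$\left(\frac{\partial H}{\partial T}\right)_{\mathcal{M}} = \frac{\mathcal{M}}{\mathcal{C}}, \qquad \left(\frac{\partial \mathcal{M}}{\partial T}\right)_H = -\frac{\mathcal{C}H}{T^2} \tag{14-22}$$
Therefore (14-21) becomes
$$C_H - C_{\mathcal{M}} = \frac{\mu_0 \mathcal{M}^2}{\mathcal{C}} = \mu_0 \mathcal{C}\left(\frac{H}{T}\right)^2 \tag{14-23}$$
and shows that $C_H = C_{\mathcal{M}}$ when $H = \mathcal{M} = 0$, as would be expected
since then the distinction between the two processes vanishes.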
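The step from (14-21) to (14-23) can be checked numerically by evaluating the two partial derivatives of the Curie law with central differences. This is a sketch only; the function names, the step size, and the sample values of temperature, field, and Curie constant are hypothetical.

```python
import math

MU0 = 4.0e-7 * math.pi  # permeability of free space in SI units (H/m)


def moment(T, H, curie):
    """Curie law (14-14): total magnetic moment as a function of T and H."""
    return curie * H / T


def field(T, M, curie):
    """Curie law solved for the field as a function of T and the moment M."""
    return M * T / curie


def heat_capacity_difference(T, H, curie, rel_step=1e-4):
    """Evaluate C_H - C_M from (14-21) using central differences."""
    M = moment(T, H, curie)
    dT = rel_step * T
    dH_dT = (field(T + dT, M, curie) - field(T - dT, M, curie)) / (2.0 * dT)
    dM_dT = (moment(T + dT, H, curie) - moment(T - dT, H, curie)) / (2.0 * dT)
    return -MU0 * T * dH_dT * dM_dT


# Hypothetical sample state and Curie constant.
T, H, curie = 2.0, 1.0e5, 1.0e-3
numeric = heat_capacity_difference(T, H, curie)
closed_form = MU0 * curie * (H / T) ** 2   # right side of (14-23)
print(numeric, closed_form)
```

The finite-difference value agrees with the closed form to within the truncation error of the differences, and both vanish when $H = 0$.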

Another application of these results concerns the temperature changes 
accompanying changes in magnetization and field to which we referred 
near the end of the last chapter. If we substitute (14-20) into (14-19) and 
use (7-10), we obtain 


$$dS = \frac{C_H\,dT}{T} + \mu_0\left(\frac{\partial \mathcal{M}}{\partial T}\right)_H dH \tag{14-24}$$
If we now consider an adiabatic process so that the entropy is constant, we
find that the relation connecting the changes in temperature and field as
obtained from (14-24) by setting $dS = 0$ is
$$dT = -\frac{\mu_0 T}{C_H}\left(\frac{\partial \mathcal{M}}{\partial T}\right)_H dH \tag{14-25}$$
and it describes the magneto-caloric effect. For virtually all materials,
$(\partial \mathcal{M}/\partial T)_H$ is negative and therefore $dT$ has the same sign as $dH$. Thus a
sudden increase in the field will result in a temperature rise, while a
decrease in the field will produce a decrease in the temperature, exactly
as we concluded from Fig. 13-2.
For the example of a Curie law material, if we substitute the second
equation of (14-22) into (14-25), we find that
$$T\,dT = \frac{\mu_0 \mathcal{C}}{C_H} H\,dH \tag{14-26}$$
If we integrate (14-26) for the case in which the field is decreased from $H_0$
to zero, treating $C_H$ as constant, we find
$$\int_{T_i}^{T_f} T\,dT = \frac{\mu_0 \mathcal{C}}{C_H}\int_{H_0}^{0} H\,dH = \frac{1}{2}(T_f^2 - T_i^2) = -\frac{\mu_0 \mathcal{C} H_0^2}{2C_H}$$
and therefore the final temperature obtained in this adiabatic demagnetization
is given by
$$T_f = T_i\left[1 - \frac{\mu_0 \mathcal{C}}{C_H}\left(\frac{H_0}{T_i}\right)^2\right]^{1/2} \tag{14-27}$$
This result shows us that the lowest final temperature will generally
correspond to a large initial field and a low initial temperature.
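The closed form (14-27) can be compared with a direct step-by-step integration of (14-26). The sketch below assumes, as the integration above does, that $C_H$ is constant; the function names and the sample values of the Curie constant and heat capacity are hypothetical and chosen only so that the result is easy to inspect.

```python
import math

MU0 = 4.0e-7 * math.pi  # permeability of free space in SI units (H/m)


def final_temperature(T_i, H_0, curie, C_H):
    """Final temperature after adiabatic demagnetization, from (14-27)."""
    x = MU0 * curie / C_H * (H_0 / T_i) ** 2
    if x >= 1.0:
        raise ValueError("(14-27) gives no real T_f; the model has been pushed too far")
    return T_i * math.sqrt(1.0 - x)


def integrate_demagnetization(T_i, H_0, curie, C_H, steps=10000):
    """Integrate dT = (mu0 C / C_H)(H / T) dH from H_0 down to zero.

    Uses the explicit midpoint method as an independent check of (14-27).
    """
    k = MU0 * curie / C_H
    T, H = T_i, H_0
    dH = -H_0 / steps
    for _ in range(steps):
        slope_start = k * H / T
        T_mid = T + 0.5 * dH * slope_start
        H_mid = H + 0.5 * dH
        T += dH * k * H_mid / T_mid
        H += dH
    return T


# Hypothetical sample values: T in K, H in A/m, Curie constant in m^3 K, C_H in J/K.
T_i, H_0, curie, C_H = 1.0, 1.0e6, 1.0e-9, 2.0e-3
T_closed = final_temperature(T_i, H_0, curie, C_H)
T_stepped = integrate_demagnetization(T_i, H_0, curie, C_H)
print(T_closed, T_stepped)
```

Halving the initial field in this example gives a higher final temperature, in line with the remark that a large initial field is favorable. Since the Curie law itself fails at sufficiently low temperatures, the numbers should be read as indicative only.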

The magneto-caloric effect is the basis of the methods by which tempera- 
tures below 0.001°K have been attained. The principal difficulty with this 
method is that the only way in which the temperature can be measured is by 
measuring the susceptibility $(\mathcal{M}/VH)$ of the material, and this requires a
knowledge of the true dependence of the magnetization on the temperature 
which, as we have already indicated, is generally not so simple as that 
given by (14-14). 


Exercises 


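14-1. Show that for a Curie law material the entropy is given by
$$S = \int \frac{C_{\mathcal{M}}\,dT}{T} - \frac{\mu_0 \mathcal{M}^2}{2\mathcal{C}} + \text{const.}$$
and that $(\partial C_H/\partial H)_T = 2\mu_0 \mathcal{C}H/T^2$.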

14-2. By starting from (14-8), define the magnetic analogs of enthalpy, 
Helmholtz function, and Gibbs function and derive all the equations correspond- 
ing to (11-5), (11-6), and (11-7). 

14-3. Show that the analog of (14-6) for an electric system is
$dU = T\,dS - p\,dV + E\,d\mathcal{P}$, where $E$ is the electric field and $\mathcal{P}$ is the total electric dipole
moment of the system. Also find the analog of a Curie law material and of
(14-25).
15 Phase transitions


Up to now we have assumed that our system has been completely homo- 
geneous throughout. Such a situation is not always an equilibrium 
possibility, however, and then it is observed that the system breaks up 
into two or more homogeneous portions called phases which are in mutual 
equilibrium. A common example is the transition of liquid water to ice, 
and a possible equilibrium situation of water is one in which the liquid 
and solid phases are coexistent. In this chapter we are concerned with the 
transitions between phases and their equilibrium with each other. 

Let us recall the behavior of a real gas below its critical temperature as 
predicted by the van der Waals equation and as illustrated by the iso- 
thermal depicted in Fig. 15-la. Suppose we were to try to compress the 
gas along this isothermal by keeping its temperature constant at the value 
T. We see from the figure that, as one would expect, the pressure increases 
as the volume is decreased until the maximum point M is reached. If we 
were to decrease the volume even more, then, according to the van der 
Waals equation, the result would be a decrease in the pressure. A situation 
such as this one is physically unstable and cannot actually be a charac- 
teristic of the equilibrium states; hence the isothermal shown in the figure 
cannot be completely correct. What is found to happen instead is that at 
a certain stage, indicated by v, the gas begins to liquefy even before the 
volume corresponding to the point of maximum pressure M is reached. 


Pp p 


Liquid 


Liquid + vapor 


() 
Fig. 15-1 
If the volume is further decreased, the pressure remains constant at the
value P as more liquid is condensed. This process continues until the
point l is reached at which all the gas has been liquefied; from then on, it
is necessary for the pressure to increase if the volume is to decrease
further.
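
The unstable stretch of the van der Waals isothermal described above can be located numerically. The sketch below assumes the reduced form of the van der Waals equation, $p_r = 8T_r/(3v_r - 1) - 3/v_r^2$ (pressure, volume, and temperature in units of their critical values); the function names and the grid are illustrative choices, not part of the text.

```python
def reduced_vdw_pressure(v_r, t_r):
    """Reduced van der Waals pressure p/p_c at reduced volume v/v_c and temperature T/T_c."""
    return 8.0 * t_r / (3.0 * v_r - 1.0) - 3.0 / v_r**2


def unstable_volumes(t_r, v_min=0.5, v_max=5.0, steps=4500):
    """Return the grid volumes at which the isotherm has dp/dv > 0 (the unstable part)."""
    dv = (v_max - v_min) / steps
    unstable = []
    for i in range(steps):
        v = v_min + i * dv
        # forward-difference estimate of the slope of the isotherm
        slope = (reduced_vdw_pressure(v + dv, t_r) - reduced_vdw_pressure(v, t_r)) / dv
        if slope > 0.0:
            unstable.append(v)
    return unstable


below = unstable_volumes(0.9)   # isotherm below the critical temperature
above = unstable_volumes(1.1)   # isotherm above the critical temperature
print("unstable part below T_c:", len(below) > 0)
print("unstable part above T_c:", len(above) > 0)
if below:
    print("unstable range of v/v_c at T/T_c = 0.9:", round(below[0], 3), "to", round(below[-1], 3))
```

Under these assumptions the region of rising pressure with rising volume appears only below the critical temperature, which is where the two-phase construction replaces the calculated curve.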

In other words, the actual isotherm observed for a gas is like that shown
in Fig. 15-1b. If $V > V_v$, the system is all in the single gas phase, while, for
$V < V_l$, it is all in the single liquid phase. At each point along the straight
line at constant pressure P and $V_l < V < V_v$, the system exists in the two
phases in equilibrium with each other (the gas when in equilibrium with the
liquid is generally called a vapor). The pressure P corresponding to the
two-phase portion of the isotherm is called the vapor pressure. It is fairly
evident that, if a different isotherm is used, the vapor pressure will be
different; in other words, P is a function of T and independent of the
volume. It is of interest to determine this function P(T) which is
characteristic of the phase transition.


15-1 The equilibrium condition and the Clausius-Clapeyron equation 


Since the equilibrium between the two phases corresponds to constant
temperature and pressure, we recall from (11-19) that we must find the
condition which minimizes the Gibbs function G. If there are a total of
$\nu$ kilomoles of material in the system of which $\nu_1$ kilomoles are in phase 1
and $\nu_2$ kilomoles in phase 2 so that

$$\nu_1 + \nu_2 = \nu = \text{const.} \tag{15-1}$$

then we have

$$G = \nu_1 g_1 + \nu_2 g_2 = \nu_1 g_1 + (\nu - \nu_1) g_2 \tag{15-2}$$


in terms of the molar Gibbs functions $g_1$ and $g_2$ of the two phases. We also
know from (11-5c) that $g_1$ and $g_2$ are functions only of T and P. We can
minimize G by differentiating (15-2) with respect to the variable $\nu_1$ and
setting the derivative equal to zero:

$$\frac{dG}{d\nu_1} = g_1 - g_2 = 0$$

and therefore

$$g_1 = g_2 \tag{15-3}$$


is the condition for equilibrium between phases so that there is no transfer 
of matter between them. 
Rather than trying at this time to describe the phase equilibrium by
evaluating the molar Gibbs functions, we want to derive from (15-3)
another useful and historically significant description in terms of the vapor
pressure curve which is a plot of the function P(T) as illustrated in
Fig. 15-2a. The point C is the critical point, and it has the coordinates $p_c$ and
$T_c$ of Sec. 12-2, as we can verify by comparing Figs. 12-3 and 15-1a and
recalling how the vapor pressure curve is derived from the various
isothermals.

To compare the equilibrium between phases 1 and 2 for two neighboring
equilibrium states A and B, as illustrated in Fig. 15-2b, we apply
(15-3) to both A and B and obtain

$$g_1^A = g_2^A \quad\text{and}\quad g_1^B = g_2^B \tag{15-4}$$

We can also write $g_i^B = g_i^A + dg_i$ for each phase, and, when this is
substituted into (15-4), we find that

$$dg_1 = dg_2 \tag{15-5}$$

which, with the use of (11-5c), becomes

$$-s_1\,dT + v_1\,dP = -s_2\,dT + v_2\,dP \tag{15-6}$$

If we solve (15-6) for dP/dT, we find that

$$\frac{dP}{dT} = \frac{s_2 - s_1}{v_2 - v_1} = \frac{\Delta s}{\Delta v} \tag{15-7}$$
This result (15-7) is known as the Clausius-Clapeyron equation, and it 
gives the slope of the equilibrium pressure vs. temperature curve in terms 
of the differences in the entropy and volume per kilomole of the two
phases. It bears an interesting resemblance to the Maxwell relation
(11-7c).

[Fig. 15-2. (a) The vapor pressure curve P(T), ending at the critical point C, with the liquid region marked; (b) two neighboring equilibrium states at (T, P) and (T + dT, P + dP).]

It is often convenient to define the latent heat $\Lambda$ as the heat absorbed
per kilomole during the transition from phase 1 to phase 2; because of
(10-27), we can write

$$\Lambda = T\,\Delta s = T(s_2 - s_1) \tag{15-8}$$

and (15-7) becomes

$$\frac{dP}{dT} = \frac{\Lambda}{T\,\Delta v} \tag{15-9}$$


Although we have been constantly referring to the condensation of a 
vapor into a liquid and vice versa as a specific example of a phase transi- 
tion, it is clear from our derivation that (15-3) and (15-7) apply to equi- 
librium between any two phases. Thus the term vapor pressure for P 
applies only when one is discussing the vapor-liquid or vapor-solid 
equilibrium; the corresponding latent heats are those of vaporization and 
sublimation, respectively. 

If we apply (15-9) to the melting of a solid into a liquid, then $\Lambda$ is called
the latent heat of fusion; if we invert (15-9), we obtain for this case

$$\frac{dT}{dP} = \frac{T\,\Delta v}{\Lambda} = \frac{T(v_l - v_s)}{\Lambda} \tag{15-10}$$

where $v_l$ and $v_s$ are the volumes of a kilomole of the liquid and the solid,
respectively. This form of the Clausius-Clapeyron equation is appropriate
for discussing how the freezing (melting) point changes with pressure. It
turns out that $\Lambda > 0$ so that, if $v_l > v_s$, that is, if the material expands on
melting, then dT/dP > 0 and an increase in pressure produces an increase
in the melting point. On the other hand, if $v_l < v_s$, the freezing point
decreases as the pressure is increased; this is precisely the situation which
applies to water, and it has many practical results. For example, it is used
to account for the way in which glaciers can apparently "flow" over and
around obstacles. The large pressure produced by the weight of the ice 
lowers the freezing point below the actual temperature so that the ice in 
contact with the obstacle melts, flows around the obstacle as liquid water, 
and then refreezes when it reaches a region where the pressure has 
decreased enough for the local temperature to be again below the freezing 
point. 
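
As a rough numerical illustration of (15-10) for water, the sketch below evaluates dT/dP at the normal melting point. The densities and the latent heat of fusion are approximate handbook-style values supplied here as assumptions; they do not come from the text.

```python
T_MELT = 273.15          # K, normal melting point of ice
LATENT_FUSION = 6.01e6   # J/kmol, approximate latent heat of fusion of water (assumed)
MOLAR_MASS = 18.015      # kg/kmol
RHO_LIQUID = 999.8       # kg/m^3, approximate density of water at 0 C (assumed)
RHO_SOLID = 916.7        # kg/m^3, approximate density of ice at 0 C (assumed)


def melting_slope(temperature, v_liquid, v_solid, latent_heat):
    """dT/dP from (15-10): T (v_l - v_s) / Lambda, in K/Pa."""
    return temperature * (v_liquid - v_solid) / latent_heat


v_l = MOLAR_MASS / RHO_LIQUID   # m^3/kmol
v_s = MOLAR_MASS / RHO_SOLID    # m^3/kmol
slope = melting_slope(T_MELT, v_l, v_s, LATENT_FUSION)
slope_per_atm = slope * 101325.0
print("dT/dP =", slope, "K/Pa =", slope_per_atm, "K/atm")
```

With these inputs the slope is negative and less than a hundredth of a degree per atmosphere, so a large pressure is needed to lower the freezing point appreciably.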


15-2 The equation of the vapor pressure curve 


Now let us consider the problem of determining the function P = P(T) 
for the important case of the equilibrium between a liquid and its vapor. 
We could do this by finding the molar Gibbs functions for the vapor and
liquid, equating them according to (15-3), and solving the resultant
equation involving only P and T for P(T). An equivalent way, of course,
is by direct integration of (15-9); we shall use this latter method first in a
very approximate manner.

If $v_v$ is the molar volume of the vapor, then generally there is so much
expansion on evaporation that we can safely say that $v_v \gg v_l$ and therefore
we can neglect the volume of the liquid and write

$$\Delta v = v_v - v_l \approx v_v \tag{15-11}$$

For simplicity and definiteness, let us assume that the pressure is low
enough that the vapor can be treated as an ideal gas; therefore we find
from (8-21) that

$$v_v = \frac{RT}{P} \tag{15-12}$$

If we substitute (15-11) and (15-12) into (15-9), we obtain

$$\frac{dP}{P} = \frac{\Lambda\,dT}{RT^2} \tag{15-13}$$

and if we further assume that $\Lambda$ is independent of the temperature we can
integrate (15-13) to obtain the equation of the vapor pressure curve in the
form

$$\ln P = -\frac{\Lambda}{RT} + B \tag{15-14}$$

where B is a constant. In spite of the many assumptions we made to
obtain (15-14), it is a surprisingly accurate representation of the dependence
of the vapor pressure on temperature for many materials. In fact, it is
common practice in many handbooks and other collections of data to
write (15-14) as ln P = -A/T + B and simply tabulate the values of A
and B for each material.
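
A minimal sketch of how the tabulated form ln P = -A/T + B might be used: two vapor pressure readings fix A and B, and $\Lambda = AR$ then follows from (15-14). The two data points for water below are approximate values supplied as assumptions, and the function names are illustrative.

```python
import math

R = 8314.0  # J/(kmol K), gas constant per kilomole


def fit_vapor_pressure(t1, p1, t2, p2):
    """Return (A, B) such that ln P = -A/T + B passes through both points."""
    a = math.log(p2 / p1) / (1.0 / t1 - 1.0 / t2)
    b = math.log(p1) + a / t1
    return a, b


def vapor_pressure(t, a, b):
    """Vapor pressure from the two-constant form of (15-14)."""
    return math.exp(-a / t + b)


# Approximate vapor pressures of water at 80 C and 100 C, in Pa (assumed inputs).
A, B = fit_vapor_pressure(353.15, 47.4e3, 373.15, 101.325e3)
latent_heat = A * R               # J/kmol, taken as constant over this range
p_mid = vapor_pressure(363.15, A, B)
print("A =", A, "K;  Lambda =", latent_heat, "J/kmol;  P(90 C) =", p_mid, "Pa")
```

The latent heat obtained this way is an average over the chosen temperature interval, consistent with the assumption of constant $\Lambda$ made in deriving (15-14).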

However, for our later purposes, we shall need a somewhat more 
accurate formula for the vapor pressure curve which we can obtain by a 
method originally due to Kirchhoff and which involves the use of (15-3) 
directly. If we continue to treat the vapor as an ideal gas, and assume that 
the temperature is high enough that the heat capacities can be treated as 
constant, we find from (11-4c), (9-15), (9-13), and (8-21) that 


$$g_v = u + Pv - Ts = (c_v T + u_{v0}) + RT - Ts = c_p T + u_{v0} - Ts \tag{15-15}$$

We also find from (9-43) that we can write s(T, v) for the vapor in the
form

$$s = c_v \ln T + R \ln v + s_{v0} \tag{15-16}$$

where $s_{v0}$ is a constant needed so that (15-16) will give the absolute value
of the entropy. If we write s as a function of T and P by using (9-13) and
(8-21), we find that we obtain

$$s = c_p \ln T - R \ln P + s_{p0} \tag{15-17}$$

where

$$s_{p0} = s_{v0} + R \ln R \tag{15-18}$$

If we now substitute (15-17) into (15-15), we find the molar Gibbs function
for the vapor to be given by

$$g_v = c_p T(1 - \ln T) + RT \ln P + u_{v0} - T s_{p0} \tag{15-19}$$


By neglecting the small amount of work done in the change of volume
of the liquid, we can also neglect the difference between $c_p$ and $c_v$ and
simply write the molar heat capacity as $c_l$. Then we find from (9-5) that

$$u_l = u_{l0} + \int_0^T c_l\,dT' \tag{15-20}$$

and from (9-5) and (10-28) that

$$s_l = \int_0^T \frac{c_l\,dT'}{T'} \tag{15-21}$$

Neglecting the term $Pv_l$ and substituting (15-20) and (15-21) into (11-4c),
we find that

$$g_l = u_{l0} + \int_0^T c_l\,dT' - T\int_0^T \frac{c_l\,dT'}{T'} \tag{15-22}$$

If we now equate (15-19) and (15-22), according to the equilibrium
condition (15-3), we can solve the resulting equation for ln P and we
obtain

$$\ln P = -\frac{\Lambda_0}{RT} + \frac{1}{R}\left(c_p \ln T - \int_0^T \frac{c_l\,dT'}{T'}\right) - \frac{1}{R}\left(c_p - \frac{1}{T}\int_0^T c_l\,dT'\right) + \frac{s_{p0}}{R} \tag{15-23}$$

as the more general equation for the vapor pressure curve, where

$$\Lambda_0 = u_{v0} - u_{l0} \tag{15-24}$$


is the molar energy difference between the liquid and vapor at absolute 
zero and therefore is the latent heat of vaporization at T = 0. Equation 
(15-23) could, of course, have been obtained in this form by integrating 
the Clausius-Clapeyron equation, and, conversely, if we were to differ- 
entiate (15-23) with respect to T and take account of all the approximations 
involved, we would eventually obtain (15-9). 
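
The statement that differentiating (15-23) leads back to (15-9) can be checked numerically. The sketch below assumes a purely hypothetical liquid heat capacity $c_l = aT$ (chosen only so that both integrals are finite and elementary) and arbitrary illustrative constants; within the approximations of the text, the latent heat is the enthalpy of the ideal vapor minus the energy of the liquid.

```python
import math

R = 8314.0            # J/(kmol K)
C_P = 2.5 * R         # hypothetical vapor heat capacity (monatomic ideal gas)
LAMBDA_0 = 1.0e7      # hypothetical latent heat at T = 0, J/kmol
S_P0 = 1.0e5          # hypothetical entropy constant, J/(kmol K)
A_LIQ = 50.0          # hypothetical coefficient in c_l = A_LIQ * T


def integral_cl(t):
    """Integral of c_l dT' from 0 to t for c_l = A_LIQ * T'."""
    return 0.5 * A_LIQ * t**2


def integral_cl_over_t(t):
    """Integral of c_l dT'/T' from 0 to t for c_l = A_LIQ * T'."""
    return A_LIQ * t


def ln_p(t):
    """Right-hand side of (15-23)."""
    return (-LAMBDA_0 / (R * t)
            + (C_P * math.log(t) - integral_cl_over_t(t)) / R
            - (C_P - integral_cl(t) / t) / R
            + S_P0 / R)


def latent_heat(t):
    """Vapor enthalpy minus liquid energy (the term P v_l is neglected)."""
    return LAMBDA_0 + C_P * t - integral_cl(t)


t = 100.0
h = 1.0e-3
numeric = (ln_p(t + h) - ln_p(t - h)) / (2.0 * h)   # d(ln P)/dT by central difference
expected = latent_heat(t) / (R * t**2)              # (15-9) with (15-11) and (15-12)
print(numeric, expected)
```

The two numbers agree to within the accuracy of the finite difference, which is the content of the remark above for this particular model.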
The feature of (15-23) which will be of the most interest to us is the fact
that it involves the absolute value of the entropy because of the appearance
of the term $s_{p0}$. Thus in principle one can determine the quantity $s_{p0}$ from
measurements of the dependence of the vapor pressure on the temperature.


15-3 Higher order phase transitions 


The phase transitions we have been discussing and which are described 
by (15-7) are called first order phase transitions. An essential feature of 
our derivation of (15-7) from (15-5) was the assumption that As and Av 
were both different from zero, that is, that there was a discontinuity in the 
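molar entropy and in the molar volume. However, we recall from (11-6c)
that

$$s = -\left(\frac{\partial g}{\partial T}\right)_P, \qquad v = \left(\frac{\partial g}{\partial P}\right)_T \tag{15-25}$$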

so that we were actually assuming that the first derivatives of the molar 
Gibbs function were discontinuous. If they were not, we would need a 
more general treatment. The treatment we shall discuss was first con- 
sidered by Ehrenfest and is commonly used as a basis of classification of 
observed phase transitions. 

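If we want to compare equilibrium at (T, P) with that at (T + dT,
P + dP), it follows again from (15-4) that we must equate

$$g(T + dT, P + dP) - g(T, P) = \left(\frac{\partial g}{\partial T}\right)_P dT + \left(\frac{\partial g}{\partial P}\right)_T dP + \frac{1}{2}\left[\left(\frac{\partial^2 g}{\partial T^2}\right)_P (dT)^2 + 2\,\frac{\partial^2 g}{\partial T\,\partial P}\,dT\,dP + \left(\frac{\partial^2 g}{\partial P^2}\right)_T (dP)^2\right] + \cdots \tag{15-26}$$

for the two phases, just as we did to obtain (15-5). The second derivatives
can be written

$$\left(\frac{\partial^2 g}{\partial T^2}\right)_P = -\left(\frac{\partial s}{\partial T}\right)_P = -\frac{c_p}{T}, \qquad \frac{\partial^2 g}{\partial T\,\partial P} = \left(\frac{\partial v}{\partial T}\right)_P = \alpha v, \qquad \left(\frac{\partial^2 g}{\partial P^2}\right)_T = \left(\frac{\partial v}{\partial P}\right)_T = -\kappa_T v \tag{15-27}$$

with the use of (15-25), (11-20), and (8-2), so that, if we equate (15-26) for
both phases, use (15-25) and (15-27), and define

$$\Delta s = s_2 - s_1, \quad \Delta v = v_2 - v_1, \quad \Delta c_p = c_{p2} - c_{p1}, \quad \text{etc.} \tag{15-28}$$

we find the condition for equilibrium to be given by

$$-\Delta s\,dT + \Delta v\,dP + \frac{1}{2}\left[-\frac{\Delta c_p}{T}(dT)^2 + 2\,\Delta(\alpha v)\,dT\,dP - \Delta(\kappa_T v)(dP)^2\right] + \cdots = 0 \tag{15-29}$$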


If $\Delta s \neq 0$ and $\Delta v \neq 0$, we can neglect the second and higher order terms
in dT and dP so that (15-29) reduces to (15-6) and (15-7); thus we see 
again that the first order transition is characterized by discontinuities in 
the first derivatives of g which appear physically as a latent heat and a 
molar volume change. 

To get a second order transition, we assume that $\Delta s = 0$ and $\Delta v = 0$
and that the second derivatives of g are discontinuous; then we can neglect
the third and higher order terms in dT and dP in (15-29) so that the phase
equilibrium condition becomes

$$-\frac{\Delta c_p}{T}(dT)^2 + 2\,\Delta(\alpha v)\,dT\,dP - \Delta(\kappa_T v)(dP)^2 = 0$$


from which we find the analog of the Clausius-Clapeyron equation to be 


$$\frac{dP}{dT} = \frac{\Delta\alpha \pm \left[(\Delta\alpha)^2 - (\Delta\kappa_T\,\Delta c_p)/Tv\right]^{1/2}}{\Delta\kappa_T} \tag{15-30}$$

since $\Delta(\alpha v) = v\,\Delta\alpha$, etc., if $\Delta v = 0$. We also see from (15-27) that the
physical characteristics of a second order phase transition are
discontinuities in the molar heat capacity at constant pressure, the thermal
expansion coefficient, and the isothermal compressibility.

One could now go on to define a phase transition of the nth order by
assuming that the nth order derivatives of the molar Gibbs functions are
discontinuous while all lower order derivatives are continuous. The
characteristics of such a transition could be discussed by the same general
method we used above for the first and second order transitions. In
practice, however, one generally need not consider any possibilities other
than first or second order. Distinguishing between the two possibilities is
often quite difficult if one depends only on measurements of the heat
capacity, since a sharp change and discontinuity in a heat capacity which
occupies only a small temperature range is very hard to distinguish from
a latent heat at a fixed temperature.
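
A short sketch of (15-30) as a computation: given the jumps in $\alpha$, $\kappa_T$, and $c_p$, it returns the two slopes dP/dT and checks that each satisfies the second order equilibrium condition. The numerical jumps are hypothetical values chosen only so that the square root is real.

```python
import math


def second_order_slopes(d_alpha, d_kappa, d_cp, temperature, molar_volume):
    """Both roots dP/dT of (15-30) for given jumps in alpha, kappa_T, and c_p."""
    disc = d_alpha**2 - d_kappa * d_cp / (temperature * molar_volume)
    if disc < 0.0:
        raise ValueError("no real slope for these jumps")
    root = math.sqrt(disc)
    return (d_alpha + root) / d_kappa, (d_alpha - root) / d_kappa


def residual(slope, d_alpha, d_kappa, d_cp, temperature, molar_volume):
    """Second order equilibrium condition divided by (dT)^2, with Delta v = 0."""
    return (-d_cp / temperature
            + 2.0 * molar_volume * d_alpha * slope
            - molar_volume * d_kappa * slope**2)


# Hypothetical jumps: 1/K, 1/Pa, J/(kmol K); state: K, m^3/kmol.
params = (1.0e-4, 1.0e-10, 300.0, 300.0, 0.02)
slope_plus, slope_minus = second_order_slopes(*params)
print(slope_plus, slope_minus)
print(residual(slope_plus, *params), residual(slope_minus, *params))
```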
Exercises


15-1. Suppose a liquid in equilibrium with its vapor is used as the working 
substance in a reversible Carnot engine. Let the isothermal processes be those 
parts of the isothermals T and T + dT which extend from the l and v points of
Fig. 15-1b. If the corresponding pressures are P and P + dP, find the efficiency
of this cycle and show that the result leads directly to the Clausius-Clapeyron 
equation. 

15-2. Show that in a first order phase transition the molar energy change is 
given by 


15-3. Assuming that the conditions under which (15-14) was derived are 
applicable, find the equation of the condensation curve in the pV plane. 

15-4. The vapor pressure, in atmospheres, of solid ammonia is given by
ln P = 18.70 - (3754/T) while that of liquid ammonia is ln P = 15.16 -
(3063/T). What are the temperature and pressure at the triple point, that is, the
state for which all three phases are in mutual equilibrium? Find the latent 
heats of vaporization and sublimation. Find the latent heat of fusion at the 
triple point. 

15-5. When a kilomole of gas is used in a porous plug experiment and 
changed from the initial state i to the final state f, it is found that the fraction x 
is liquefied. Show that 

$$x = \frac{h_{f,\text{vap}} - h_i}{h_{f,\text{vap}} - h_{f,\text{liq}}}$$


15-6. Under what conditions are (15-14) and (15-23) equivalent? 


Part Three 


Kinetic Theory of Gases 


16 Probability and the distribution function 


As we have already mentioned several times, thermodynamics basically 
deals with relations among the experimentally determined macroscopic 
quantities which describe the bulk properties of matter. The absolute
values of these quantities are also of interest, of course, and it would be 
desirable to be able to relate them to the individual characteristics and 
behavior of the atomic and molecular constituents of the material. In this 
way we can hope to develop a better understanding of the fundamental 
origins of the thermodynamic properties and laws, while at the same time 
we can obtain theoretical expressions which will enable us to make 
specific calculations. 

The two principal methods which have been developed to discuss the 
relation between the macroscopic and microscopic characteristics of 
matter are called kinetic theory and statistical mechanics. As we shall find 
later, statistical mechanics is the more general in its concepts and methods 
and is correspondingly more difficult. Kinetic theory is much more 
detailed and graphic in its descriptions and therefore is somewhat more 
desirable as an introduction to the general ideas involved in a statistical 
approach; however, since kinetic theory is so very specific in its methods, 
it is easy to deal with only when one is discussing gases and we shall accord- 
ingly restrict ourselves to them. 

Kinetic theory basically looks for its answers in terms of the motions of 
the individual atoms and molecules of which the system is comprised. 
Since the number of molecules in a kilomole of gas equals Avogadro’s 
number, $L = 6.025 \times 10^{26}$ (kilomole)$^{-1}$, an enormous number of
mechanical degrees of freedom are involved. On the other hand, we have 
seen that the needs of thermodynamics are adequately met with only a 
very few variables. These macroscopic variables are accordingly assumed 
to be obtainable as averages of appropriate molecular properties. Thus 
our approach can be statistical, and we therefore begin with a brief dis- 
cussion of some of the concepts of probability. 


16-1 Fundamentals of probability theory 

If we let $N_t$ be the total number of trials of some event, and $N_o$ be the
number of occurrences of a certain kind, then we define the probability of
the occurrence as

$$P = \lim_{N_t \to \infty} \frac{N_o}{N_t} \tag{16-1}$$

that is, as the limiting value of the ratio of occurrences to trials, and
assuming that a limit exists. For example, the trials could be the tossing
of a coin, and an occurrence of interest could be the appearance of the
head or the tail; then, if the occurrences were due entirely to chance,
heads and tails would appear equally often on the average so that P = 1/2 for
either case.
There are two fundamental laws or assumptions about the combining of 
probabilities. If we define $P_{1\ \text{or}\ 2}$ as the probability that one or the other
of two mutually exclusive events 1 and 2 will occur, then

$$P_{1\ \text{or}\ 2} = P_1 + P_2 \tag{16-2}$$

where $P_1$ and $P_2$ are the separate probabilities. If we define $P_{1\ \text{and}\ 2}$ as the
probability that both of the independent events 1 and 2 will occur, then

$$P_{1\ \text{and}\ 2} = P_1 P_2 \tag{16-3}$$
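
The two rules can be illustrated by direct counting of equally likely outcomes. The sketch below uses ordinary six-sided dice as an assumed example; the helper `probability` is an illustrative name, not something defined in the text.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)


def probability(event, outcomes):
    """Fraction of equally likely outcomes for which the event is true."""
    outcomes = list(outcomes)
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


# One die: "shows 1" and "shows 2" are mutually exclusive events.
one_die = [(face,) for face in FACES]
p_one = probability(lambda o: o[0] == 1, one_die)
p_two = probability(lambda o: o[0] == 2, one_die)
p_one_or_two = probability(lambda o: o[0] in (1, 2), one_die)

# Two dice: "first shows 1" and "second shows 2" are independent events.
two_dice = list(product(FACES, repeat=2))
p_first_one = probability(lambda o: o[0] == 1, two_dice)
p_second_two = probability(lambda o: o[1] == 2, two_dice)
p_both = probability(lambda o: o[0] == 1 and o[1] == 2, two_dice)

print(p_one_or_two == p_one + p_two)          # rule (16-2)
print(p_both == p_first_one * p_second_two)   # rule (16-3)
```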


It will often be convenient for us to deal with variables which can 
assume continuous values and to which we want to apply probability 
concepts. For example, let us consider the probability that a point chosen 
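at random along the length $l$ of Fig. 16-1a will fall within the segment $\Delta x$.
Since all points in $l$ are equally likely, the desired probability is the ratio of
the length of the segment $\Delta x$ to the total length, or

$$P_{\Delta x} = \frac{\Delta x}{l} \tag{16-4}$$

[Fig. 16-1. (a) A segment $\Delta x$ of a line of length $l$; (b) a circle with the shaded ring between radii r and r + dr.]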
We see that the sum of these probabilities is unity, as it should be since the
point must lie somewhere on the segment by hypothesis; that is,

$$\sum P_{\Delta x} = \sum \frac{\Delta x}{l} = 1 \tag{16-5}$$

It is evident that this result is a general property of a probability, that is,
the sum of the probability of each possibility over all the possibilities
must be unity. If we choose $\Delta x$ to be smaller and smaller, we can say that
the probability of choosing a point in the segment dx is $P_{dx} = dx/l$.

If the probability that a variable x will have a value between x and
x + dx can be written in the form

$$P = w(x)\,dx \tag{16-6}$$

then the function w(x) is defined as the probability density. In the example
above, w(x) = 1/l = const. As another example, let us consider the
probability that a point picked at random in the circle of Fig. 16-1b, of
radius $r_0$, will be in the ring shown shaded, that is, that the point will lie
between the circles of radii r and r + dr. In this case the probability is the
ratio of the area of the ring to the total area of the circle and thus is

$$P = \frac{2\pi r\,dr}{\pi r_0^2} = \frac{2r}{r_0^2}\,dr \tag{16-7}$$

When we compare (16-6) and (16-7), we see that the probability density in
this case is

$$w(r) = \frac{2r}{r_0^2} \tag{16-8}$$

and is not a constant as it was for the first example.
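
A quick Monte Carlo sketch of (16-7): points are picked uniformly in a circle of radius $r_0$ and the fraction landing in a thin ring is compared with $w(r)\,dr$. The sample size, seed, and ring position are arbitrary choices made for illustration.

```python
import random


def sample_radius_in_disk(rng, r0):
    """Pick a point uniformly in a disk of radius r0 by rejection; return its radius."""
    while True:
        x = rng.uniform(-r0, r0)
        y = rng.uniform(-r0, r0)
        r_squared = x * x + y * y
        if r_squared <= r0 * r0:
            return r_squared ** 0.5


def ring_fraction(samples, r, dr):
    """Fraction of sampled radii lying between r and r + dr."""
    inside = sum(1 for s in samples if r <= s < r + dr)
    return inside / len(samples)


rng = random.Random(12345)
R0 = 1.0
radii = [sample_radius_in_disk(rng, R0) for _ in range(200_000)]
r, dr = 0.5, 0.02
estimated = ring_fraction(radii, r, dr)
predicted = 2.0 * r / R0**2 * dr     # w(r) dr from (16-8)
print(estimated, predicted)
```

The agreement is only approximate, both because of statistical scatter and because the ring has a finite width dr.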
If M(x) is a given function of x, the average value of M is written as $\overline{M}$
and is defined by

$$\overline{M} = \int M(x)\,w(x)\,dx \tag{16-9}$$

where the integral is taken over all possible values of x. As an example,
suppose that the speed u of a particle depends on the variable x. The
average speed will then be given by

$$\overline{u} = \int u(x)\,w(x)\,dx$$

Similarly, the average of the square of the speed will be

$$\overline{u^2} = \int u^2(x)\,w(x)\,dx$$

It is clear from the last two expressions that, in general, $\overline{u^2} \neq (\overline{u})^2$.
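
The definition (16-9) can be evaluated numerically for the ring density (16-8). The sketch below uses a simple midpoint rule (an assumed choice of quadrature) to show that the density is normalized and that the average of $r^2$ differs from the square of the average of r.

```python
def average(func, density, lower, upper, steps=100_000):
    """Midpoint-rule estimate of the integral of func(x) * density(x) dx, as in (16-9)."""
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width
        total += func(x) * density(x)
    return total * width


R0 = 2.0   # illustrative radius of the circle


def ring_density(r):
    """Probability density (16-8) for the radius of a random point in the circle."""
    return 2.0 * r / R0**2


norm = average(lambda r: 1.0, ring_density, 0.0, R0)
mean_r = average(lambda r: r, ring_density, 0.0, R0)
mean_r_squared = average(lambda r: r * r, ring_density, 0.0, R0)
print(norm, mean_r, mean_r_squared, mean_r**2)
```

For this density the exact values are $\overline{r} = 2r_0/3$ and $\overline{r^2} = r_0^2/2$, so the two quantities in the last comparison are indeed unequal.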
We can extend these ideas to a situation where we have k variables
$(x_1, x_2, \ldots, x_k)$ by defining a probability density w which is a function of
all these variables so that $w(x_1, x_2, \ldots, x_k)\,dx_1\,dx_2 \cdots dx_k$ is the probability
that the value of $x_1$ is between $x_1$ and $x_1 + dx_1$, that of $x_2$ is between $x_2$ and
$x_2 + dx_2$, ..., and that of $x_k$ is between $x_k$ and $x_k + dx_k$.

It follows from our definitions that $0 \le P_i \le 1$, where $P_i$ is the probability
of a given value of a variable which only takes on discrete values, and
that the relation $0 \le w\,dx_1 \cdots dx_k \le 1$ holds for continuous variables.
Also, $\sum_i P_i = 1$ as previously remarked, and

$$\int \cdots \int w(x_1, \ldots, x_k)\,dx_1 \cdots dx_k = 1 \tag{16-10}$$

The result (16-10) is sometimes called the normalization of w, and the
k-fold integral is taken over all possible values of all the variables.

The definition of the average value of a function of several variables can
be similarly generalized as

$$\overline{M} = \int \cdots \int M(x_1, \ldots, x_k)\,w(x_1, \ldots, x_k)\,dx_1 \cdots dx_k \tag{16-11}$$

It follows at once from this definition that

$$\overline{M + N} = \overline{M} + \overline{N}, \qquad \overline{CM} = C\,\overline{M} \tag{16-12}$$

where C is a constant; we also see in general that $\overline{MN} \neq (\overline{M})(\overline{N})$.


Example. Isotropic Distribution of Velocities. Let us suppose that we 
have a situation in which the velocities of the molecules in a gas are 
isotropically distributed; by this we mean that there is as much chance 
that a molecule will be going in a given direction as in any other direction. 
In other words, there are exactly as many molecules traveling in one 
direction as in any other, on the average. 

We let $P_\omega\,d\omega$ be the probability that the direction of the velocity
vector u is in the element of solid angle $d\omega$. For an isotropic distribution
in which all directions are equally likely, the probability cannot depend
on the specific direction; hence

$$P_\omega\,d\omega = K\,d\omega, \qquad K = \text{const.} \tag{16-13}$$

We can determine K from the normalization condition (16-10):

$$\int P_\omega\,d\omega = 1 = K \int d\omega = 4\pi K$$

Therefore $K = 1/4\pi$ and

$$P_\omega\,d\omega = \frac{d\omega}{4\pi} \tag{16-14}$$

As a simple application of this result, let us find the average of the
component of the velocity along some fixed direction in space which we
choose as the z axis. If we let $\theta$ be the angle between u and z, and if we
consider first only those molecules with speed u, we find from (16-11) and
(16-14) that

$$\overline{u_z} = \overline{u\cos\theta} = \int u\cos\theta\left(\frac{d\omega}{4\pi}\right) = \frac{u}{4\pi}\int_0^{2\pi}\!\int_0^{\pi} \cos\theta\,\sin\theta\,d\theta\,d\varphi = 0 \tag{16-14'}$$

Since this relation holds for any speed, we conclude that $\overline{u_z} = 0$ in general.
This is not surprising, of course, and it must be true if the velocity distri- 
bution is isotropic for then there is no net transport of molecules from one 
place to another. 
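
The vanishing of the average velocity component can also be seen by sampling directions from the isotropic distribution (16-14). The sketch below draws $\cos\theta$ uniformly between -1 and 1, which is equivalent to weighting by $\sin\theta\,d\theta$; the common speed, the seed, and the sample size are arbitrary illustrative choices.

```python
import math
import random


def random_direction(rng):
    """Return a unit vector drawn from the isotropic distribution (16-14)."""
    cos_theta = rng.uniform(-1.0, 1.0)        # since d(omega) = sin(theta) d(theta) d(phi)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    sin_theta = math.sqrt(1.0 - cos_theta**2)
    return (sin_theta * math.cos(phi), sin_theta * math.sin(phi), cos_theta)


rng = random.Random(2024)
speed = 500.0        # hypothetical common speed, m/s
n_samples = 200_000
mean_uz = sum(speed * random_direction(rng)[2] for _ in range(n_samples)) / n_samples
print("average z component / speed:", mean_uz / speed)
```

The sampled average is not exactly zero, but it is small compared with the speed and shrinks as the number of samples grows.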


16-2 The molecular distribution function 


We are able to apply statistical methods to a gas because of the very 
large number of molecules involved. In our description we shall assume 
that the positions of the molecules can be specified, at least in principle. 
We shall also assume for simplicity that any forces between the molecules 
can be neglected except when they are very close together as during a 
collision. The simple model of a gas which we shall adopt is that the 
molecules can be treated as small, smooth, perfectly elastic spheres; thus 
there will be forces of interaction only when the molecules are in actual 
contact, and since they are smooth they cannot transfer rotational energy 
and we need consider only their translational kinetic energy. 

In order to calculate averages, we shall need to know the probabilities 
of the various molecular properties; that is, we shall need to know the 
molecular distribution function f(r, u, t) which we shall define exactly below. 
The molecular velocity will be designated by u, the components by $u_x$, $u_y$,
$u_z$, and the speed by u. We shall let $d\tau$ be an element of physical volume,
for example, dx dy dz when rectangular coordinates are used. Then, by
definition:

$f(\mathbf{r}, \mathbf{u}, t)\,d\tau\,du_x\,du_y\,du_z$ = number of molecules at the time t
which are in the volume $d\tau$ at the position r, and whose
velocity components are in the range $u_x$ to $u_x + du_x$, $u_y$ to
$u_y + du_y$, and $u_z$ to $u_z + du_z$ (16-15)

It will sometimes be convenient to refer to these molecules as having their
velocities in the "volume" $du_x\,du_y\,du_z$ which is located at the point u in
"velocity space."

[Fig. 16-2. The number density n measured in adjacent volume elements.]


It follows from (16-15) that

$$d\tau \iiint f(\mathbf{r}, \mathbf{u}, t)\,du_x\,du_y\,du_z = n\,d\tau$$

is the total number of molecules in the volume $d\tau$ regardless of their
velocities; therefore n is the number of molecules per unit volume (the
number density) and is given by

$$n(\mathbf{r}, t) = \iiint f(\mathbf{r}, \mathbf{u}, t)\,du_x\,du_y\,du_z \tag{16-16}$$

It is clear that, if we took the volume element $d\tau$ to be extremely small,
the density n as measured for the various $d\tau$ would have large fluctuations
in it depending on whether there happened to be a molecule in $d\tau$ or not.
Since we want to deal with continuous functions, what we shall always be
doing is choosing our volume elements big enough to contain a large
number of molecules, yet small enough that there is no appreciable
variation of the properties of the gas within the volume element. We shall
continue to choose $d\tau$ to be extremely small as far as macroscopic
dimensions are concerned. This situation is illustrated schematically in
Fig. 16-2, which shows the sort of step function we would obtain for n by
measuring the number of molecules in adjacent volumes; the process we
follow then is to define a smooth curve for n (and similar quantities) to
replace this step function.

Although the time has been included in (16-15) and (16-16) for generality 
in the definition, we shall not be considering situations in which the 
probabilities and average properties are changing with time; that is, we 
shall discuss only "steady states."

We can see from our discussion so far that the problems in kinetic
theory can be divided roughly into two classes, first, those of finding the
appropriate distribution function, and, second, those of using the
distribution function to calculate averages which can be compared with
experimental values of the corresponding property.


16-3 Constant speed gas 


Probably the simplest picture of a gas which can be visualized is one in 
which all molecules have the same speed $u_0$. Since many of our subsequent
calculations would be greatly simplified if such were the case, we should 
investigate this possibility. In order for such a gas to exist, the molecules 
would have to have the same speeds after a collision as they did before. 
If we consider the particular type of collision illustrated in Fig. 16-3, we 
can show that the speeds do change. 

We assume the rigid elastic spheres to have the same mass m. At the
instant of collision, the line of centers is vertical as shown by the dashed
line; both molecules have the same speed $u_0$ by hypothesis. After the
collision, molecule 1 goes off, making an angle $\theta$ with its original direction
of motion. Since no external forces are involved, the total momentum is
conserved, according to (I: 6-10). The x and y components of the momentum
conservation equation are, respectively,

$$m u_0 = m u_1' \cos\theta \tag{16-17}$$

$$m u_0 = m u_1' \sin\theta - m u_2' \tag{16-18}$$

The equation expressing conservation of energy is

$$\tfrac{1}{2} m u_0^2 + \tfrac{1}{2} m u_0^2 = \tfrac{1}{2} m u_1'^2 + \tfrac{1}{2} m u_2'^2 \tag{16-19}$$

[Fig. 16-3. Collision of two molecules which both have the speed $u_0$ before impact.]

The speeds can be eliminated from these equations, and we find the
resulting condition on the angle $\theta$ to be

$$\sin\theta\,(\cos\theta - \sin\theta) = 0 \tag{16-20}$$

The first possibility that $\sin\theta = 0$ and therefore $\theta = 0$ represents a
complete miss and is of no further interest to us. The condition $\cos\theta =
\sin\theta$ which follows from (16-20) tells us that $\theta = 45°$; when this value of
$\theta$ is inserted into (16-17) and (16-18), we find that

$$u_1' = \sqrt{2}\,u_0, \qquad u_2' = 0 \tag{16-21}$$


which shows that the speeds of the molecules are definitely changed as a 
result of this particular collision. Since it is reasonable that this type of 
collision, as well as all other possible types, will frequently occur, we are 
forced to conclude that it will be impossible for a gas to consist of molecules 
which all have the same speed. In addition, we shall have to assume the 
possibility of all speeds from zero to infinity being found in the gas. 
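
The collision argument can be followed numerically. The sketch below solves the momentum equations (16-17) and (16-18) for the final speeds at a trial angle and then tests the energy equation (16-19); the function names and the trial angles are illustrative choices.

```python
import math


def final_speeds(u0, theta):
    """Solve the momentum equations (16-17) and (16-18) for the final speeds."""
    u1 = u0 / math.cos(theta)
    u2 = u1 * math.sin(theta) - u0
    return u1, u2


def energy_mismatch(u0, theta):
    """Final minus initial kinetic energy per unit mass; zero when (16-19) holds."""
    u1, u2 = final_speeds(u0, theta)
    return 0.5 * (u1**2 + u2**2) - u0**2


u0 = 1.0
for degrees in (30.0, 45.0, 60.0):
    theta = math.radians(degrees)
    print(degrees, final_speeds(u0, theta), energy_mismatch(u0, theta))
```

Of the trial angles, only 45 degrees balances the energy, and at that angle one molecule leaves with speed $\sqrt{2}\,u_0$ while the other is brought to rest.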


Exercises 


16-1. By means of either a few examples or a general discussion based on 
counting of possibilities, verify the rules (16-2) and (16-3). 

16-2. Show that the definition (16-9) is the same as the usual arithmetic 
definition of an average as the quotient of a sum divided by the number of 
terms. 

16-3. Find the probability density which gives the chance that a point chosen 
at random within a sphere of radius $r_0$ will lie within the thin shell bounded by
spheres of radii r and r + dr. 


17 Pressure of an ideal gas and the equation of state


It is possible to obtain useful results simply from the knowledge that the 
distribution function f exists, without the necessity of knowing its specific 
form. We shall illustrate this fact for a particular example and postpone 
the detailed evaluation of f to the next chapter. 

We want to calculate the pressure exerted on the walls of the container 
by an ideal gas, which we have defined for kinetic theory purposes as one 
in which the molecular forces can be neglected except at collision. For 
simplicity we shall assume that there is only one type of molecule. We 
shall consider a small portion of the bounding surface which we can
treat as a plane surface of area $\Delta A$, and we shall calculate the pressure p
times the area $\Delta A$ as the time average of the force $F_\perp$ which is
perpendicular to the wall and arises from the collisions of the molecules with
the wall. If we write the time interval involved as $\Delta t$, then

$$p\,\Delta A = (\overline{F_\perp})_{\text{gas on wall}} = \frac{1}{\Delta t}\int_{\Delta t} F_\perp\,dt \tag{17-1}$$

Since the force exerted by the wall on the gas is the negative of $F_\perp$, as
shown by (I: 3-10), and equals the time rate of change of the momentum
$P_\perp$ of the gas, we have

$$-F_\perp = \frac{dP_\perp}{dt}$$

so that (17-1) becomes

$$p\,\Delta A\,\Delta t = \int_{\Delta t}\left(-\frac{dP_\perp}{dt}\right)dt = -\Delta P_\perp = (P_\perp)_i - (P_\perp)_f \tag{17-2}$$

where $\Delta P_\perp = (P_\perp)_f - (P_\perp)_i$ is the net change of momentum of the gas as
a result of the collisions.

In order to calculate $\Delta P_\perp$, let us consider first the group of molecules
which are approaching the wall and whose velocity vectors equal u on the
average and lie in the range $du_x\,du_y\,du_z$; the direction of u makes an angle
$\theta$ with the normal to the wall where $0 < \theta < \pi/2$, as illustrated in Fig. 17-1.
We construct a cylinder on $\Delta A$ whose cross section is shown and whose
generators are parallel to u and of length $u\,\Delta t$; therefore all the molecules
contained in this volume will strike the wall in the time $\Delta t$. Since the
volume of the cylinder is $u\,\Delta t \cos\theta\,\Delta A$, the total number of molecules of
this velocity type which collide with the wall in $\Delta t$ is found from the
definition (16-15) to be

$$f \cdot u\,\Delta t \cos\theta\,\Delta A \cdot du_x\,du_y\,du_z \tag{17-3}$$

[Fig. 17-1. Cross section of the cylinder erected on $\Delta A$, with generators parallel to u at the angle $\theta$ to the normal to the wall.]

If m is the mass of each molecule, the perpendicular component of
momentum of each molecule of this type is $mu\cos\theta$, and therefore the
total momentum brought to the wall by molecules of this velocity group is
found by multiplying the contribution of each by the total number given by
(17-3) and is

$$mu\cos\theta \cdot f \cdot u\,\Delta t \cos\theta\,\Delta A \cdot du_x\,du_y\,du_z \tag{17-4}$$


If we now integrate (17-4) over all velocities possible for the molecules
incident upon the wall, we find the initial perpendicular momentum to be

$$(P_\perp)_i = m\,\Delta A\,\Delta t\int_{0<\theta<\pi/2} u^2\cos^2\theta\, f\,du_x\,du_y\,du_z \tag{17-5}$$

To calculate the momentum carried away from the wall we proceed in a
similar manner and erect a volume on $\Delta A$ with generators of length $u\,\Delta t$
and parallel to $\mathbf{u}$, which is now directed away from the wall so that
$\pi/2 < \theta < \pi$. The calculation is done exactly as above except that now the
volume of the cylinder is

$$u\,\Delta t\,|\cos\theta|\,\Delta A = -u\,\Delta t\cos\theta\,\Delta A$$

since $\cos\theta$ is negative in this case. Therefore we find that

$$(P_\perp)_f = -m\,\Delta A\,\Delta t\int_{\pi/2<\theta<\pi} u^2\cos^2\theta\, f\,du_x\,du_y\,du_z \tag{17-6}$$

If we now insert (17-5) and (17-6) into (17-2), cancel out the common
factor $\Delta A\,\Delta t$, note that the integrations over $\theta$ can be combined into a
single one over the whole possible range of $\theta$ since the integrands are the
same, and use the definitions (16-11), (16-15), and (16-16), we find that the
pressure is given by

$$p = m\int (u\cos\theta)^2 f\,du_x\,du_y\,du_z = nm\left[\frac{1}{n}\int (u\cos\theta)^2 f\,du_x\,du_y\,du_z\right] = nm\,\overline{(u\cos\theta)^2} \tag{17-7}$$


If we let $u_\perp = u\cos\theta$ be the component of velocity perpendicular to the
wall, we can write (17-7) as

$$p = nm\,\overline{u_\perp^2} \tag{17-8}$$

We note again that we obtained this general result without knowing the
specific form of the distribution function $f$.

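If we now assume in addition that the velocities have an isotropic
distribution so that all directions are equivalent, we have

$$\overline{u_\perp^2} = \overline{u_x^2} = \overline{u_y^2} = \overline{u_z^2} = \tfrac{1}{3}\,\overline{(u_x^2 + u_y^2 + u_z^2)} = \tfrac{1}{3}\,\overline{u^2} \tag{17-9}$$

and (17-8) then becomes

$$p = \tfrac{1}{3}\,nm\,\overline{u^2} \tag{17-10}$$

If we let $\bar{\epsilon}_t = \tfrac{1}{2}m\overline{u^2}$ be the average translational kinetic energy of a
molecule, we can also write (17-10) as

$$p = \tfrac{2}{3}\,n\bar{\epsilon}_t \tag{17-11}$$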
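The isotropy step (17-9) can be checked numerically. The following Python sketch is only an illustration: the function names, the sample size, and the sample numbers in the usage note are our own choices and do not come from the text. It estimates the average of $\cos^2\theta$ over isotropic directions, which should be close to 1/3, and evaluates the pressure from (17-10).

```python
import random


def mean_cos2_isotropic(samples: int = 200_000, seed: int = 0) -> float:
    """Estimate the average of cos^2(theta) over isotropic directions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        # Three independent Gaussian components give a direction that is
        # uniformly distributed over the sphere once normalized.
        x, y, z = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        r2 = x * x + y * y + z * z
        total += z * z / r2          # cos^2(theta) measured from the z axis
    return total / samples


def pressure(n: float, m: float, mean_u2: float) -> float:
    """Pressure from (17-10): p = (1/3) n m <u^2>, in MKS units."""
    return n * m * mean_u2 / 3.0
```

With assumed values of about $2.69 \times 10^{25}$ molecules per cubic meter, a molecular mass of $6.64 \times 10^{-27}$ kilogram, and $\overline{u^2} \approx 1.70 \times 10^{6}$ (meter/second)$^2$, `pressure` gives roughly $10^5$ newton/(meter)$^2$, that is, about one atmosphere.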


Although our results so far have given us a kinetic definition of pressure,
we can easily use them to obtain a kinetic theory definition of temperature
as well. If $N$ is the total number of molecules in the volume $V$, the number
density $n$ is

$$n = \frac{N}{V} \tag{17-12}$$

and (17-11) can also be written as

$$pV = \tfrac{2}{3}\,N\bar{\epsilon}_t \tag{17-13}$$

We also have $N = \nu L$, however, where $\nu$ is the number of kilomoles and $L$
is Avogadro's number, so that (17-13) becomes $pV = \nu\,\tfrac{2}{3}L\bar{\epsilon}_t$, which is the
same as the ideal gas equation of state, $pV = \nu RT$, provided that
$\tfrac{2}{3}L\bar{\epsilon}_t = RT$ or that

$$\bar{\epsilon}_t = \frac{3}{2}\left(\frac{R}{L}\right)T = \frac{3}{2}\,kT \tag{17-14}$$

The constant

$$k = \frac{R}{L} = 1.38 \times 10^{-23}\ \text{joule/degree} \tag{17-14'}$$

is known as the Boltzmann constant or the gas constant per molecule.
Thus we see from (17-14) that the average translational kinetic energy per
molecule is $\tfrac{3}{2}kT$, which we can take as our kinetic definition of temperature.
If (17-14) is substituted into (17-13), we obtain another useful form for the
equation of state in terms of the total number of molecules:


$$pV = NkT \tag{17-15}$$

We can also use this result to calculate some of the thermodynamic
properties of a monatomic ideal gas, something we were not able to do
before. Since the only energy is due to the translational energy, the energy
per kilomole is

$$u = L\bar{\epsilon}_t = \tfrac{3}{2}RT = u(T) \tag{17-16}$$

which is a function of the temperature only. We can use (9-14) and (9-13)
to find the molar heat capacities; the results are

$$c_v = \tfrac{3}{2}R, \qquad c_p = \tfrac{5}{2}R, \qquad \frac{c_p}{c_v} = \frac{5}{3} = 1.67 \tag{17-17}$$

These numerical results agree very well with experiment.
If we solve for $\overline{u^2}$ from (17-10), (17-11), and (17-14), we find that

$$\overline{u^2} = \frac{3kT}{m} = \frac{3RT}{\mu} \tag{17-18}$$

where $\mu = Lm$ is the molecular weight. We can use this result to obtain
an idea of the actual magnitude of molecular speeds. If we consider
helium, for which $\mu = 4$ kilograms/kilomole, at a temperature $T = 273$
degrees, we find that the root-mean-square (rms) speed

$$u_{\mathrm{rms}} = \left(\overline{u^2}\right)^{1/2} \tag{17-19}$$

is 1300 meters/second. We also see from (17-18) that the molecular speeds
increase with temperature since $u_{\mathrm{rms}} \sim \sqrt{T}$, making the higher temperatures
correspond to greater average energy of molecular motion.
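The helium estimate can be reproduced with a few lines of Python. This is a minimal sketch; the value of the gas constant, $R = 8314$ joule/(kilomole degree), is an assumption of ours since it is not quoted in this section, and the function name is hypothetical.

```python
import math

R = 8314.0  # gas constant in joule/(kilomole degree); assumed value


def rms_speed(T: float, mu: float) -> float:
    """Root-mean-square speed from (17-18) and (17-19): sqrt(3RT/mu)."""
    return math.sqrt(3.0 * R * T / mu)


# Helium (mu = 4 kilograms/kilomole) at T = 273 degrees.
helium_rms = rms_speed(273.0, 4.0)
```

The result is close to the 1300 meters/second quoted above, and quadrupling the temperature doubles the speed, as the $\sqrt{T}$ dependence requires.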


Exercises 


17-1. Extend the calculation of the pressure to the case in which more than 
one type of molecule is present, and thus obtain Dalton’s law (8-22). 

17-2. Find the temperature at which the average translational kinetic energy 
of an atom equals that of a singly charged ion of the same mass which has 
passed through a potential difference of one volt. 

17-3. Show that the number of molecules per unit area per unit time which
leak out through a small hole in the wall of the container is given by

$$\Gamma_n = \tfrac{1}{4}\,n\bar{u} \tag{17-20}$$

17-4. Consider a two-dimensional gas in which the molecules are constrained
to move in a plane. Show that the rate at which molecules strike the boundary
per unit length is $n_2\bar{u}/\pi$, where $n_2$ is the number of molecules per unit area.


18 Maxwell's velocity distribution function


In considering the distribution of molecular velocities apart from the 
question of the variation with position as described by the density n, we 
shall substantially be following the method used by Maxwell, who first 
obtained the correct answer to this problem. 

We write $f$ in the form of a product

$$f(\mathbf{r}, \mathbf{u}, t) = n(\mathbf{r}, t)\,F(\mathbf{u}, \mathbf{r}, t) \tag{18-1}$$

and we see that the meaning of $F$ is that

$F(\mathbf{u}, \mathbf{r}, t)\,du_x\,du_y\,du_z$ = the fraction of the molecules at $\mathbf{r}$
whose velocity components are in the range $du_x\,du_y\,du_z$
about $u_x$, $u_y$, $u_z$. (18-2)

If we consider only gases at equilibrium, $F$ will not depend on $\mathbf{r}$ or $t$, and
it can be a function only of the components of $\mathbf{u}$; that is, $F = F(u_x, u_y, u_z)$.
We assume first of all that the velocity distribution is isotropic and
therefore $F$ can be a function only of the speed,

$$F = F(u) \tag{18-3a}$$

The second assumption we make is that the distribution of the $x$ component
is independent of the distributions of the $y$ and $z$ components, etc.,
so that we can write

$$F(u) = \mathcal{F}(u_x)\,\mathcal{F}(u_y)\,\mathcal{F}(u_z) \tag{18-3b}$$

These two assumptions are sufficient to determine the form of $F$ and of $\mathcal{F}$
uniquely.
Taking the logarithm of (18-3b), we obtain

$$\ln F(u) = \ln\mathcal{F}(u_x) + \ln\mathcal{F}(u_y) + \ln\mathcal{F}(u_z) \tag{18-4}$$

and, if we differentiate this expression with respect to $u_x$, we obtain

$$\frac{\partial\ln F}{\partial u_x} = \frac{d\ln F}{du}\frac{\partial u}{\partial u_x} = \frac{u_x}{u}\frac{d\ln F}{du} = \frac{d\ln\mathcal{F}(u_x)}{du_x} \tag{18-5}$$

since $u = (u_x^2 + u_y^2 + u_z^2)^{1/2}$. Two more equations like (18-5) can be
found by differentiating (18-4) with respect to $u_y$ and $u_z$; by dividing them
through by $u_x$, $u_y$, and $u_z$, respectively, we find that


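$$\frac{1}{u}\frac{d\ln F}{du} = \frac{1}{u_x}\frac{d\ln\mathcal{F}(u_x)}{du_x} = \frac{1}{u_y}\frac{d\ln\mathcal{F}(u_y)}{du_y} = \frac{1}{u_z}\frac{d\ln\mathcal{F}(u_z)}{du_z} \tag{18-6}$$

These functions of different variables in (18-6) can always be equal to each
other only if they equal the same constant. Letting this constant be
written as $-2\gamma$, we find from (18-6) that

$$\frac{1}{u_x}\frac{d\ln\mathcal{F}(u_x)}{du_x} = -2\gamma$$

and therefore

$$\ln\mathcal{F}(u_x) = \ln a - \gamma u_x^2$$

where $a$ is another constant; finally we obtain

$$\mathcal{F}(u_x) = a\,e^{-\gamma u_x^2} \tag{18-7}$$

with similar expressions for $\mathcal{F}(u_y)$ and $\mathcal{F}(u_z)$. Substituting these results
into (18-3b), we find

$$F(u) = a^3 e^{-\gamma(u_x^2 + u_y^2 + u_z^2)} = a^3 e^{-\gamma u^2} \tag{18-8}$$

The constant $a$ can be evaluated from the normalization condition since
the speed must have some value; therefore, if we use (18-2) and (16-10)
and the fact that the $\mathcal{F}$'s all have the same form, we obtain

$$\int F(u)\,du_x\,du_y\,du_z = 1 = \left(a\int_{-\infty}^{\infty} e^{-\gamma\xi^2}\,d\xi\right)^3 = \left(a\sqrt{\frac{\pi}{\gamma}}\right)^3$$

if we introduce $\xi$ as a general variable of integration. We also used the
general formula

$$\int_0^{\infty}\xi^n e^{-\gamma\xi^2}\,d\xi = \frac{\Gamma\!\left[\tfrac{1}{2}(n+1)\right]}{2\gamma^{(n+1)/2}} \tag{18-9a}$$

involving the gamma function which also satisfies the relations

$$\Gamma(z) = (z-1)\,\Gamma(z-1), \qquad \Gamma(\tfrac{1}{2}) = \sqrt{\pi} \tag{18-9b}$$

Therefore $a = \sqrt{\gamma/\pi}$ and (18-7) and (18-8) become

$$\mathcal{F}(u_x) = \left(\frac{\gamma}{\pi}\right)^{1/2} e^{-\gamma u_x^2}, \qquad F(u) = \left(\frac{\gamma}{\pi}\right)^{3/2} e^{-\gamma u^2} \tag{18-10}$$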
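The integral formula (18-9a) is used repeatedly in what follows, so a numerical spot check may be reassuring. The sketch below is illustrative only: the function names, the cutoff of the integration range, and the step count are our own assumptions. It compares the formula with a simple midpoint-rule estimate.

```python
import math


def gaussian_moment_formula(n: int, gamma: float) -> float:
    """Right-hand side of (18-9a)."""
    return math.gamma((n + 1) / 2.0) / (2.0 * gamma ** ((n + 1) / 2.0))


def gaussian_moment_numeric(n: int, gamma: float, steps: int = 20_000) -> float:
    """Midpoint-rule estimate of the left-hand side of (18-9a)."""
    upper = 12.0 / math.sqrt(gamma)   # the integrand is negligible beyond this
    d = upper / steps
    total = 0.0
    for i in range(steps):
        xi = (i + 0.5) * d
        total += xi ** n * math.exp(-gamma * xi * xi)
    return total * d
```

For example, with $n = 4$ the formula gives $3\sqrt{\pi}/(8\gamma^{5/2})$, which is the value needed to evaluate $\overline{u^2}$.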


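The constant $\gamma$ can be determined by calculating $\overline{u^2}$ and using (17-18); we
find from (18-9) and (18-10) that

$$\begin{aligned}
\overline{u^2} &= \iiint u^2 F(u)\,du_x\,du_y\,du_z \\
&= \left(\frac{\gamma}{\pi}\right)^{3/2}\int_0^{2\pi}\!\int_0^{\pi}\!\int_0^{\infty} u^2 e^{-\gamma u^2}\,u^2\sin\theta\,du\,d\theta\,d\varphi \\
&= 4\pi\left(\frac{\gamma}{\pi}\right)^{3/2}\int_0^{\infty} u^4 e^{-\gamma u^2}\,du = \frac{3}{2\gamma}
\end{aligned} \tag{18-11}$$

where we introduced spherical coordinates in velocity space in order to
perform the integration easily. Combining (18-11) and (17-18), we obtain

$$\frac{3}{2\gamma} = \frac{3kT}{m}$$

so that

$$\gamma = \frac{m}{2kT} \tag{18-12}$$

and therefore

$$\mathcal{F}(u_x) = \left(\frac{m}{2\pi kT}\right)^{1/2} e^{-mu_x^2/2kT} \tag{18-13}$$

$$F(u) = \left(\frac{m}{2\pi kT}\right)^{3/2} e^{-m(u_x^2 + u_y^2 + u_z^2)/2kT} = \left(\frac{m}{2\pi kT}\right)^{3/2} e^{-mu^2/2kT} \tag{18-14}$$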

are the final forms of the functions which we have obtained by combining 
the experimental law for the pressure with a very general derived expression 
for the pressure. This result (18-14) is Maxwell’s velocity distribution 
function. We shall be able to derive it in a more satisfactory way during 
our discussion of statistical mechanics, although a more precise derivation 
of (18-14) can be obtained by purely kinetic theory methods and concepts. 
The latter procedure was first carried out by Boltzmann, who studied the 
effect of molecular collisions on the distribution function and was able to 
show that the distribution (18-14) is, on the average, unchanged by 
collisions and therefore would be constant in time, as it should be in 
order that it represent an equilibrium distribution. 
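Since (18-13) is a Gaussian in each component with variance $kT/m$, the distribution (18-14) is easy to sample, and the sampled mean square speed should approach $3kT/m$ as required by (17-18). The following Python sketch illustrates this; the function names, the choice of units with $kT/m = 1$, and the sample size are our own assumptions.

```python
import math
import random


def sample_velocity(rng: random.Random, kT_over_m: float):
    """Draw (u_x, u_y, u_z), each component distributed according to (18-13)."""
    sigma = math.sqrt(kT_over_m)   # standard deviation of one component
    return rng.gauss(0.0, sigma), rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)


def mean_square_speed(samples: int = 100_000, kT_over_m: float = 1.0,
                      seed: int = 1) -> float:
    """Sample average of u^2 for velocities drawn from (18-14)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        ux, uy, uz = sample_velocity(rng, kT_over_m)
        total += ux * ux + uy * uy + uz * uz
    return total / samples
```

In units where $kT/m = 1$, `mean_square_speed()` should return a value near 3, with statistical scatter of order one percent or less for the sample size chosen here.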


18-1 Distribution functions for the components and speed 


Now let us try to find what fraction of the molecules have an $x$ component
of their velocity between $u_x$ and $u_x + du_x$; we shall write this
fraction as $h(u_x)\,du_x$. We can see from the $u_xu_y$ plane of velocity space
shown in Fig. 18-1 that all vectors $\mathbf{u}$ whose end points lie in the slab of
thickness $du_x$ perpendicular to the $u_x$ axis at $u_x$ have the $x$ component of
interest. Thus we can use (16-2) and find our fraction $h(u_x)\,du_x$ by
summing $F(u)\,du_x\,du_y\,du_z$ over all values of $u_y$ and $u_z$ while keeping $u_x$
constant; with the use of (18-10) we then obtain

$$\begin{aligned}
h(u_x)\,du_x &= \left(\frac{\gamma}{\pi}\right)^{3/2} e^{-\gamma u_x^2}\,du_x\int_{-\infty}^{\infty} e^{-\gamma u_y^2}\,du_y\int_{-\infty}^{\infty} e^{-\gamma u_z^2}\,du_z \\
&= \left(\frac{\gamma}{\pi}\right)^{1/2} e^{-\gamma u_x^2}\,du_x = \mathcal{F}(u_x)\,du_x
\end{aligned} \tag{18-15}$$

[Fig. 18-1]

[Fig. 18-2: $\mathcal{F}(u_x)$ versus $u_x$]

[Fig. 18-3]

Since there will be similar equations for the other components, we see
that the functions $\mathcal{F}(u_x)$ which were introduced in (18-3b) have the physical
significance that each is the distribution function for a single component.

The function $\mathcal{F}(u_x)$ has the familiar form of the Gaussian error curve
and has the general appearance shown in Fig. 18-2. It is apparent from
the figure that the most probable value of a component is zero; since the
curve is an even function, the average value of a given component is also
zero, that is, $\bar{u}_x = 0$. Physically, of course, this is a
consequence of our initial assumption of an isotropic distribution for
which all directions are equally probable.

Let the fraction of the molecules which have a speed between $u$ and
$u + du$ be $\phi(u)\,du$. We see from Fig. 18-3 that this distribution function
can be obtained by finding the total fraction represented by all the vectors
$\mathbf{u}$ whose end points in velocity space lie between the two spheres of radii
$u$ and $u + du$. Thus we want to sum $F(u)\,du_x\,du_y\,du_z$ over the complete
solid angle while keeping the value of $u$ constant; if we use (18-14) and
write the volume element as $u^2\,du\,d\omega$ where $d\omega$ is an element of solid
angle in velocity space, we obtain

$$\phi(u)\,du = \int_{\omega} F(u)\,u^2\,du\,d\omega = 4\pi\left(\frac{m}{2\pi kT}\right)^{3/2} u^2 e^{-mu^2/2kT}\,du \tag{18-16}$$

This distribution is no longer Gaussian and is not symmetric with respect
to the most probable speed since $u$ can have only positive values. The
general dependence of $\phi(u)$ on $u$ is shown in Fig. 18-4.
The most probable speed $u_m$ is found by calculating $d\phi/du$ from (18-16)
and then setting $d\phi/du = 0$ to find the value of $u$ which corresponds to the
maximum of $\phi$. The result is

$$u_m = \left(\frac{2kT}{m}\right)^{1/2} \tag{18-17}$$


[Fig. 18-4: $\phi(u)$ versus $u$, with $u_m$, $\bar{u}$, and $u_{\mathrm{rms}}$ marked on the $u$ axis]

The average speed $\bar{u}$ is different from this and is obtained from (18-16) and
(18-9) as

$$\bar{u} = \int_0^{\infty} u\,\phi(u)\,du = \left(\frac{8kT}{\pi m}\right)^{1/2} \tag{18-18}$$

The rms speed is given by (17-18) and (17-19). One can easily show that

$$\bar{u} = \frac{2}{\sqrt{\pi}}\,u_m = 1.13\,u_m, \qquad u_{\mathrm{rms}} = \sqrt{\frac{3}{2}}\,u_m = 1.22\,u_m \tag{18-19}$$

and therefore that $u_m < \bar{u} < u_{\mathrm{rms}}$, as illustrated in Fig. 18-4, although the
relative location of these three quantities on the curve has been exaggerated
in order to show them separated.
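The ratios in (18-19) can be confirmed by integrating the speed distribution (18-16) numerically. The sketch below is a rough check under our own assumptions: units with $kT/m = 1$, a midpoint rule, and a cutoff at twelve times $\sqrt{kT/m}$; none of these choices comes from the text.

```python
import math


def phi(u: float, kT_over_m: float) -> float:
    """Speed distribution function (18-16)."""
    prefactor = (1.0 / (2.0 * math.pi * kT_over_m)) ** 1.5
    return 4.0 * math.pi * prefactor * u * u * math.exp(-u * u / (2.0 * kT_over_m))


def moment(power: int, kT_over_m: float, steps: int = 20_000) -> float:
    """Midpoint-rule estimate of the integral of u**power * phi(u) du."""
    u_max = 12.0 * math.sqrt(kT_over_m)   # phi is negligible beyond this speed
    du = u_max / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du
        total += u ** power * phi(u, kT_over_m)
    return total * du


kT_over_m = 1.0
u_m = math.sqrt(2.0 * kT_over_m)            # most probable speed (18-17)
u_avg = moment(1, kT_over_m)                # average speed, compare (18-18)
u_rms = math.sqrt(moment(2, kT_over_m))     # rms speed, compare (17-18)
```

The zeroth moment checks the normalization of $\phi$, and the computed ratios `u_avg / u_m` and `u_rms / u_m` should agree with $2/\sqrt{\pi}$ and $\sqrt{3/2}$.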


Exercises 


18-1. Verify (18-17) and (18-18). 

18-2. Find the value of $\overline{(1/u)}$ for a Maxwell distribution.

18-3. Calculate the total x component of momentum transported per unit area 
per unit time in the positive x direction across the yz plane within a gas in 
equilibrium, and interpret the result. 

18-4. Gas leaks out slowly through a small hole in a container and into a 
vacuum. Find the rate per unit area at which energy is transported out the 
hole. With the help of (17-20), show that the average energy of an escaping 
molecule is 2kT. How do you account for the fact that this energy is greater 
than the average energy of a molecule in the container? 

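18-5. Show that the normalized distribution function which gives the fraction
of the molecules whose translational kinetic energy is between $\epsilon_t$ and $\epsilon_t + d\epsilon_t$ is

$$\frac{2}{\sqrt{\pi}}\,(kT)^{-3/2}\sqrt{\epsilon_t}\;e^{-\epsilon_t/kT}\,d\epsilon_t \tag{18-20}$$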


Find the most probable energy and the average energy. 


19 Mean free path phenomena 


Because in our specific calculations so far we have neglected the finite size 
of the gas molecules, we have not explicitly needed to consider the collisions 
of the molecules with each other. To find what effect the mutual collisions 
of the molecules may have on the macroscopic properties of the gas, we 
begin with a concept introduced by Clausius which is extremely useful in
characterizing the effect of collisions.

19-1 The mean free path


The free path of a molecule is defined as the distance traveled by it
between two successive collisions; the mean free path $l$ is similarly defined
as the average distance traveled by a molecule between two collisions. In
an ideal gas where the molecules are treated as points, the mean free path 
is infinite and we need only discuss collisions with the walls as we did when 
we calculated the pressure. We shall consider first a special case in order 
to illustrate the general approach which can be used in dealing with 
molecules of finite size. 


Example. Collisions with Fixed Molecules. Suppose the gas molecules
are colliding with other molecules which are located at fixed positions,
as would be the case, for instance, for neutrons in paraffin or in a pile.
If the radius of the moving molecule is $R_1$ and that of the fixed molecule
is $R_2$, the distance between their centers at collision is

$$s = R_1 + R_2 \tag{19-1}$$

as illustrated in Fig. 19-1a. The problem is therefore the same as if the
moving molecules had an "effective" radius $s$ and were colliding with
fixed point molecules. Thus a molecule of velocity $\mathbf{u}$ will have swept
out a volume $\pi s^2u\,\Delta t$ in a time $\Delta t$ as seen from Fig. 19-1b. If the density
of fixed molecules is $n$, the number contained in this volume will be
$n\pi s^2u\,\Delta t$. Since every fixed molecule in this volume will correspond to
a collision with our representative moving molecule, we can say that

$$\text{Average number of collisions in } \Delta t = n\pi s^2u\,\Delta t \tag{19-2}$$

On the other hand, the average number of collisions also equals the
total distance traveled divided by the average distance between collisions
(the mean free path $l$); hence we also have

$$\text{Average number of collisions in } \Delta t = \frac{u\,\Delta t}{l} \tag{19-3}$$

Equating (19-2) and (19-3), we find $l$ to be given by

$$l = \frac{1}{n\pi s^2} = \frac{1}{n\sigma} \tag{19-4}$$

where $\sigma = \pi s^2$ is the effective area for a collision and is called the cross
section for collision. We note that (19-4) is independent of $u$.

[Fig. 19-1: (a), (b)]


Of even more interest to us is the mean free path for the mutual collisions
of gas molecules having a Maxwell distribution of velocities. This is a
much harder quantity to calculate, and we shall only quote the final result
which is

$$l = \frac{1}{\sqrt{2}\,n\sigma} = \frac{0.707}{n\sigma} \tag{19-5}$$

Thus, if we are primarily interested in order of magnitude calculations we
can use either (19-4) or (19-5) for our estimates.

Let us find the approximate value of $l$ for a monatomic gas under
reasonable conditions. We use $s \approx 2 \times 10^{-10}$ meter, which is typical of
atomic sizes; we also use the value of $n$ at the standard conditions listed
after (8-19), called Loschmidt's number, $n_0 = 2.69 \times 10^{25}$ (meter)$^{-3}$.
Inserting these numbers into (19-5), we find that $l \approx 2 \times 10^{-7}$ meter
$\approx 10^3 s$. We can compare this with the average distance between the
molecules, $l_0$, which we can obtain by equating $l_0^3$ to the average volume per
molecule so that $l_0^3 \approx 1/n_0$. Thus $l_0 \approx n_0^{-1/3} = 3.3 \times 10^{-9}$ meter $\approx 15s$
and $l \approx 70\,l_0$. If we use the value previously found for the average speed
in helium, we find the average time between collisions to be about

$$\tau = \frac{l}{\bar{u}} \approx 1.5 \times 10^{-10}\ \text{second}$$

so that the frequency of collisions is about $1/\tau \approx 7 \times 10^{9}$ (second)$^{-1}$.
These results show that the molecules of an ideal gas undergo collisions 
at an enormous rate and that the molecule travels with its very large 
average speed only for the short time between collisions; during this time 
it travels about one hundred times the average distance between molecules 
before a collision abruptly alters its direction of motion. 
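The order-of-magnitude estimates above are easily reproduced. The following Python sketch uses the numbers quoted in the text ($s \approx 2 \times 10^{-10}$ meter, Loschmidt's number, and a speed of about 1300 meters/second for helium); the function name and the `maxwellian` switch between (19-4) and (19-5) are our own illustrative choices.

```python
import math


def mean_free_path(n: float, s: float, maxwellian: bool = True) -> float:
    """Mean free path from (19-5), or from (19-4) if maxwellian is False."""
    sigma = math.pi * s * s                      # collision cross section
    factor = 1.0 / math.sqrt(2.0) if maxwellian else 1.0
    return factor / (n * sigma)


n0 = 2.69e25          # Loschmidt's number, (meter)^-3
s = 2.0e-10           # typical atomic size, meter
speed = 1300.0        # speed found above for helium, meters/second

l = mean_free_path(n0, s)          # mean free path, meter
l0 = n0 ** (-1.0 / 3.0)            # average distance between molecules, meter
tau = l / speed                    # average time between collisions, second
```

These give a mean free path of about $2 \times 10^{-7}$ meter, an average spacing of about $3.3 \times 10^{-9}$ meter, and a time between collisions of order $10^{-10}$ second, in line with the rounded figures in the text.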

The utility of the mean free path will become evident from the following
examples.

19-2 Viscosity


We shall consider first the situation in which the gas has a mass motion
corresponding to a velocity in the $x$ direction of speed $v$, which will be in
addition to the velocity of thermal motion which is described by the
Maxwell distribution; therefore $\bar{u}_x = v$ while $\bar{u}_y$ and $\bar{u}_z$ are still zero. We
shall also assume that $v = v(y)$ so that the mass velocity changes in a
direction perpendicular to the net flow as illustrated in Fig. 19-2. In
Fig. 19-3, we see how a rectangular parallelepiped enclosing a definite
portion of the gas will have its shape altered as one follows this portion
along in its motion. This situation can also be described in terms of the
various horizontal layers of the gas dragging across each other and
exerting forces on the adjacent layers as indicated by the arrows in the
dashed parallelogram. The net result is a tendency to slow down the faster
layers and speed up the slower layers, so that there is a shearing
stress in the gas, that is, a force $F_x$ which is tangential to the surface of
area $A$ which is perpendicular to the direction in which the mass velocity
is changing and which bounds this portion of the gas. It is found
experimentally that the stress, that is, the tangential force per unit area, is
proportional to the gradient of the mass velocity; thus we have

$$\frac{F_x}{A} = \eta\,\frac{\partial v}{\partial y} \tag{19-6}$$

where the proportionality factor $\eta$ is called the coefficient of viscosity or
simply the viscosity.

[Fig. 19-2]

[Fig. 19-3]

[Fig. 19-4]

Since viscosity has many of the attributes associated with friction, it
will be helpful in understanding both the origin of viscosity and our
subsequent calculation of $\eta$ to consider the following question: How can
there be viscosity in an ideal gas in which the only processes we consider 
are perfectly elastic collisions between smooth molecules? In order to 
answer this question, let us consider a rather rough analogy. Suppose we 
have two very long trains running along parallel tracks but with different 
speeds $v_2 > v_1$ as shown in Fig. 19-4. Now suppose also that at a given
instant the passengers in each train begin firing machine guns at the other
train, aiming so that the bullets are fired perpendicular to the motion of
the train as indicated by the short arrows in the figure. Assuming that
when the bullets hit they lodge in the train and are carried along with it,
we see that each bullet which strikes train 2 has its momentum in the
direction parallel to the track suddenly increased by $m(v_2 - v_1)$. Therefore,
if $\mathcal{R}$ is the rate at which bullets are striking train 2, the force which the
train must exert in order to continue moving at speed $v_2$ equals the rate of
change of the momentum of the bullets and is given by

$$F_2 = m(v_2 - v_1)\,\mathcal{R}$$


If the train is not able to exert this force, the net effect will be to slow the 
train down. Similarly, the bullets of higher momentum coming from 
train 2 and striking the lower momentum region of train 1 tend to speed 
up the slower moving train 1. The eventual result of this process would be 
to make both trains travel at the same speed—a result similar to that 
expected from viscosity. 

If we look again at the layer in the gas shown shaded in Fig. 19-3 and 
recall that the molecules above the layer have a greater x component of 
momentum than those below, we see that, as the gas molecules move 
about, the net effect will be to give the gas below the layer an increase in 
momentum. Therefore, in order to find the viscosity, we need to calculate 
the net transport of $x$ component of momentum, and, in particular, we
want the net gain of momentum of the material below a given plane which 
we choose at y = 0 as shown in Fig. 19-5a. 

We shall assume, for simplicity, that all molecules passing through the
plane have come from a distance equal to the mean free path. Thus, in
effect, they have originated from the surface of the sphere of radius $l$ from
where their last collision directed them across the $y = 0$ plane. Therefore
a molecule coming from above and traveling at an angle $\theta$ from the $y$ axis
brings down to this plane an $x$ component of momentum given by

$$p_d = mv(y) = mv(l\cos\theta) \approx mv(0) + ml\cos\theta\left(\frac{\partial v}{\partial y}\right)_0 \tag{19-7}$$

Similarly, for a molecule traveling upward at the corresponding angle $\theta$,
we have

$$p_u = mv(-l\cos\theta) \approx mv(0) - ml\cos\theta\left(\frac{\partial v}{\partial y}\right)_0 \tag{19-8}$$

For every downward molecule of this velocity type there is an upward one
since there is no mass motion in the $y$ direction; hence the net increase of
$x$ momentum of the gas below the plane for each downward moving
molecule is

$$\Delta p = p_d - p_u = 2ml\cos\theta\left(\frac{\partial v}{\partial y}\right)_0 \tag{19-9}$$

In order to find the number of downward moving molecules in a time
interval $\Delta t$, we construct the cylinder on the area $\Delta A$ in the $xz$ plane
shown in Fig. 19-5b; as before, the generators of the cylinder are parallel
to $\mathbf{u}$ and of length $u\,\Delta t$ and the number of molecules of this type which
have crossed $\Delta A$ in $\Delta t$ equals the number in the volume. Using (16-15),
(18-1), (18-14), and (18-16), we find this number to be

$$u\,\Delta t\cos\theta\,\Delta A \cdot nF \cdot du_x\,du_y\,du_z = \Delta t\,\Delta A \cdot nu\cos\theta \cdot \frac{\phi(u)}{4\pi u^2}\,u^2\sin\theta\,du\,d\theta\,d\varphi \tag{19-10}$$


[Fig. 19-5: (a), (b)]

When we multiply (19-9) and (19-10), we obtain the net increase of
momentum due to these molecules. If we now integrate this quantity over
all possible velocities for downward moving molecules, we find the total
transfer of $x$ component of momentum to the material below the plane to
be given by

$$P\,\Delta t\,\Delta A = \Delta t\,\Delta A\,\frac{nm}{2\pi}\left(\frac{\partial v}{\partial y}\right)_0\int_0^{\infty} lu\,\phi(u)\,du\int_0^{\pi/2}\cos^2\theta\sin\theta\,d\theta\int_0^{2\pi} d\varphi \tag{19-11}$$

where $P$ is defined as the total rate of downward transfer of $x$ momentum
per unit area. If we neglect any dependence of $l$ on speed, we find from
(19-11) that

$$P = \frac{1}{3}\,nml\left(\frac{\partial v}{\partial y}\right)_0\int_0^{\infty} u\,\phi(u)\,du = \frac{1}{3}\,nml\bar{u}\left(\frac{\partial v}{\partial y}\right)_0 \tag{19-12}$$

Since force is rate of change of momentum, we also have $P = F_x/A =
\eta(\partial v/\partial y)_0$ because of (19-6); combining the last equation with (19-12), we
find the viscosity to be given by

$$\eta = \tfrac{1}{3}\,nml\bar{u} \tag{19-13}$$


Since this result involves /, we see that measurements of the viscosity 
will enable us to evaluate the mean free path—something which is 
obviously impracticable to do directly from its definition. In addition, we 
can use this value of / to find the cross section for collision and then the 
molecular radius. Measurements of this kind were among the first which 
gave quantitative information about molecular sizes. 

The dependence of $\eta$ on the density, indicated by (19-13), is only
superficial because we have seen in (19-5) that the mean free path is
inversely proportional to $n$ so that the product $nl$ is independent of the
density or, equivalently, independent of the pressure because of (17-15).
Hence the kinetic theory quite unequivocally predicts that the viscosity of
an ideal gas is independent of the density. This prediction has been verified
experimentally over a considerable range of densities and represents an
outstanding achievement in the historical development of kinetic theory.
This independence of the density may seem surprising at first because one
might think that the "friction" might decrease as the amount of gas
decreases and hence that the viscosity would also be less. But if we recall
the origin of viscosity we can see now that if, for example, we double the
density, the number of molecules carrying excess $x$ momentum across a
given plane will also double; the molecules come, however, from an
average distance only half as far away, hence their excess mass velocity is
only half as great as before; the net result is that the excess $x$ momentum
will be unchanged and therefore the viscosity will be the same.
We can also see, however, that this prediction actually cannot hold for
extreme values of the density. If the density is very large, the gas mole- 
cules are comparatively close together and the short range intermolecular 
forces which we have been neglecting become important. At the other 
extreme where the density is very small, the mean free path becomes 
comparable to or larger than the dimensions of the container; the col- 
lisions with the walls then may become more important than collisions 
with other molecules, and our method of calculation will no longer be 
applicable. 
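The density independence and the $\sqrt{T}$ dependence of the viscosity can both be seen by evaluating (19-13) together with (19-5) and (18-18). The Python sketch below is only a rough illustration of the simple theory, not a prediction to be compared closely with measurement; the function names, the value of Avogadro's number, and the use of $s = 2 \times 10^{-10}$ meter for helium are assumptions of ours.

```python
import math

K_BOLTZMANN = 1.38e-23   # joule/degree, from (17-14')
AVOGADRO = 6.022e26      # molecules per kilomole; assumed value


def mean_speed(T: float, m: float) -> float:
    """Average speed (18-18)."""
    return math.sqrt(8.0 * K_BOLTZMANN * T / (math.pi * m))


def mean_free_path(n: float, s: float) -> float:
    """Mean free path (19-5) with cross section sigma = pi * s**2."""
    return 1.0 / (math.sqrt(2.0) * n * math.pi * s * s)


def viscosity(n: float, T: float, m: float, s: float) -> float:
    """Viscosity (19-13): eta = (1/3) n m l u_avg, in MKS units."""
    return n * m * mean_free_path(n, s) * mean_speed(T, m) / 3.0


m_helium = 4.0 / AVOGADRO                                    # kilogram
eta_1 = viscosity(2.69e25, 273.0, m_helium, 2.0e-10)
eta_2 = viscosity(2.0 * 2.69e25, 273.0, m_helium, 2.0e-10)   # doubled density
eta_hot = viscosity(2.69e25, 4.0 * 273.0, m_helium, 2.0e-10) # quadrupled T
```

Here `eta_1` and `eta_2` coincide because the factor $n$ cancels against the $1/n$ in the mean free path, while `eta_hot` is twice `eta_1`, as (19-14) requires.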

The result (19-13) does make a prediction about the temperature
dependence of the viscosity because, if we also use (18-18), we see that

$$\eta \sim \bar{u} \sim \sqrt{T} \tag{19-14}$$


Therefore the viscosity increases with the temperature. This behavior is in 
contrast to that of liquids whose viscosity generally decreases rapidly as 
the temperature increases, the proverbial example being molasses. The 
temperature dependence given by (19-14) is in fair agreement with experi- 
ment; we can see that (19-14) cannot be entirely correct because the 
molecules are not completely rigid and elastic as we have been assuming— 
this effect can be expected to make the effective size at collision as well as 
the mean free path depend on speed, and therefore the temperature will 
be involved in a more complicated way than in (19-14). 


19-3 General transport and thermal conductivity 


If we now define momentum flow to be positive if it is in the positive $y$
direction and let $P'$ be the total average rate of momentum transfer per
unit area, (19-12) takes the form

$$P' = -\frac{1}{3}\,nl\bar{u}\left[\frac{\partial(mv)}{\partial y}\right]_0 \tag{19-15}$$

Suppose we now consider any general quantity $G$ which can be transferred
by the motion of molecules and also define a net rate of transfer of
$G$ per unit area which we write as $\Gamma$. We can now find $\Gamma$ by simply
repeating the calculations which led to (19-15) and replacing $mv$ by $G$;
the result is

$$\Gamma = -\frac{1}{3}\,nl\bar{u}\left(\frac{\partial G}{\partial y}\right)_0 \tag{19-16}$$
0 


We can make an immediate application of this general transport equation
to the calculation of the thermal conductivity of an ideal gas.

If there is a temperature gradient in a gas so that $T = T(y)$, then (17-14)
becomes $\bar{\epsilon}_t = \tfrac{3}{2}kT(y)$ and there can be a net transfer of energy because of
the motion of the molecules. If we let the corresponding rate per unit
area be represented by $Q$, we can immediately find $Q$ from (19-16) by
letting $G = \bar{\epsilon}_t$; therefore

$$Q = -\frac{1}{3}\,nl\bar{u}\left(\frac{\partial\bar{\epsilon}_t}{\partial y}\right)_0 = -\frac{1}{2}\,knl\bar{u}\left(\frac{\partial T}{\partial y}\right)_0 \tag{19-17}$$

The coefficient of thermal conductivity $K$ is defined by

$$Q = -K\left(\frac{\partial T}{\partial y}\right)_0 \tag{19-18}$$

so that we have

$$K = \tfrac{1}{2}\,knl\bar{u} \tag{19-19}$$

We see from this result that the thermal conductivity has the same
properties as the viscosity, that is, $K$ is independent of the density and
proportional to $\sqrt{T}$; both of these predictions have been quite well
confirmed by experiment.
If we divide (19-13) by (19-19), we find that

$$\frac{\eta}{K} = \frac{2m}{3k} \tag{19-20}$$

This simple and interesting expression is reminiscent of thermodynamic
results and has the virtue that it is independent of $l$ and $\bar{u}$ and hence of
many of the simplifying assumptions we have made. We can put (19-20)
into a more useful form by multiplying numerator and denominator by $L$,
using (17-14') and (17-17), and introducing the molecular weight $\mu = Lm$;
in this way we find that

$$\frac{\eta c_v}{K\mu} = 1 \tag{19-21}$$


which is an expression making a definite prediction about this particular 
combination of directly measurable quantities. 
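The intermediate steps leading to this combination can be written out as follows. This is our own expansion of the manipulation described in the text, using only $k = R/L$ from (17-14'), $c_v = \tfrac{3}{2}R$ from (17-17), and $\mu = Lm$:

```latex
\frac{\eta}{K} = \frac{2m}{3k}
             = \frac{2Lm}{3Lk}
             = \frac{2\mu}{3R}
             = \frac{\mu}{c_v}
\qquad\Longrightarrow\qquad
\frac{\eta\,c_v}{K\mu} = 1
```

The mean free path and the average speed have already dropped out at the first step, which is why the final combination involves only directly measurable quantities.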

When the expression on the left in (19-21) is evaluated from experi- 
mental results on some monatomic gases, we find that it has the following 
values: for helium, 0.402; for neon, 0.424; and, for argon, 0.404. 
Although the value is not unity as predicted by (19-21), we do see that it 
is remarkably constant and hence is strong evidence for the general 
validity of our theory, which is what first led us to consider looking at this 
particular combination of quantities. That our predicted numerical value 
is 1 rather than about 0.4 is a consequence of the extreme simplifying
assumptions we have made in deriving our formulas. A more elaborate
theory which evaluates the transported quantities more accurately, and 
which also takes into account the fact that the gas molecules are not 
completely elastic, results in a different numerical factor which is more in 
accord with experiment. 

Viscosity and heat conduction are both examples of non-equilibrium 
phenomena. Thus we see that kinetic theory enables us to deal with this 
type of problem as well as purely equilibrium situations. We shall not 
discuss the specialized methods of kinetic theory any further, however, but 
instead we shall again consider the problems associated with equilibrium, 
but in much more general terms than before. 


Exercises 


19-1. Verify (19-16) in detail. 

19-2. Show that the viscosity of a gas which is constrained to move in two
dimensions is given by $\eta = \tfrac{1}{2}\,n_2ml\bar{u}$.

19-3. If there is a concentration gradient in a gas, there will be a transport of
molecules given by $\Gamma_n = -D\,(\partial n/\partial y)$, where $D$ is called the diffusion coefficient.
Show that $D = \tfrac{1}{3}\,l\bar{u}$ and therefore that $D = \eta/nm$. What is the value of $K/D$?

19-4. A container of gas is separated into two parts by a thin wall with a very
small hole in it. If the two parts are kept at different pressures and temperatures,
show that there will be no net flow of molecules through the hole when

$$\frac{p_1}{\sqrt{T_1}} = \frac{p_2}{\sqrt{T_2}}$$


Part Four 


Statistical Mechanics 


20 Fundamental principles 


Our brief discussion of kinetic theory has shown us that we are able to 
discuss states of thermodynamic equilibrium by means of statistical 
considerations. We were able to derive the equation of state of an ideal 
gas and to calculate the absolute values of its heat capacities. From this 
experience we can reasonably conclude that probability considerations 
will play a vital role when we discuss other systems by less specialized 
methods. On the other hand, the concept of entropy which is so important 
in thermodynamics is virtually absent from much of kinetic theory. The 
principal contribution of Boltzmann to the creation of statistical mechanics 
was his showing that there is an intimate connection between the two 
concepts of entropy and probability. 

As we saw in the examples of Sec. 16-1, however, we cannot make any 
calculations of specific probabilities until we have made some assumptions 
about what corresponds to events of "equal probability." In all our
kinetic theory calculations, we assumed without hesitation that the
probability of finding a molecule in a certain range of position and velocity
was given by

$$f\,dx\,dy\,dz\,du_x\,du_y\,du_z$$


according to (16-15). Thus we assumed that the probability was propor- 
tional to the size of the volume element in the six-dimensional space we 
effectively used to describe the motion of a single molecule. In other 
words, we assumed that equal volumes have equal probability, in so far as 
their size alone is concerned. This is a very simple idea, and it seems as if 
it would be an attractive starting point for our consideration of general 
systems. The difficulty which immediately arises, however, is: In what 
space does equal size correspond to equal probability? A hint about the 
appropriate space to use can be obtained from Hamiltonian mechanics. 


20-1 I'-space and ensembles 


Let us begin by considering a conservative system which has a total of
$\mathcal{N}$ degrees of freedom; for example, if it were composed of a total of N
particles, each of which had f degrees of freedom, then


$$\mathcal{N} = fN \tag{20-1}$$
Fig. 20-1 (panels (a) and (b); axes p and q)


The equations of motion of the system are Hamilton’s equations 
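
$$\dot{p}_i = -\frac{\partial H}{\partial q_i}, \qquad \dot{q}_i = \frac{\partial H}{\partial p_i} \qquad (i = 1, 2, \ldots, \mathcal{N}) \tag{20-2}$$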


according to (I: 10-7). The Hamiltonian function H is numerically equal 
to the energy and is written as a function of the generalized coordinates $q_i$
and the momenta $p_i$; that is, $H = H(p_1, \ldots, p_{\mathcal{N}}, q_1, \ldots, q_{\mathcal{N}})$.

We can now define a $2\mathcal{N}$-dimensional space which has the $p_i$ and $q_i$ as
coordinates; this space is called Γ-space or phase space. If our system of
interest is a kilomole of gas, for example, then $\mathcal{N} \sim 10^{27}$. Since the
instantaneous state of the system is defined by the $2\mathcal{N}$ values of the $p_i$ and
$q_i$, we see that the instantaneous state can be represented by a single point in
Γ-space whose coordinates are these $2\mathcal{N}$ quantities. As time goes on and
the state of the system changes so that the $p_i$ and $q_i$ change, this representative
point traces out a path in Γ-space as illustrated schematically in Fig.
20-1a, where p and q represent a particular two-dimensional projection of
Γ-space.


Example. One-Dimensional Harmonic Oscillator. The Hamiltonian of 
this system of one degree of freedom is 


$$H(p, q) = \frac{p^2}{2m} + \frac{1}{2} m \omega^2 q^2 = \epsilon \tag{20-3}$$


according to (I: 10-10 and 5-9), where m is the mass, $\omega = 2\pi\nu$ is its
circular frequency, and $\epsilon$ is its energy. If we write (20-3) in the form

$$\frac{p^2}{2m\epsilon} + \frac{q^2}{(2\epsilon/m\omega^2)} = 1 \tag{20-4}$$

we see that the path of the representative point of this system in its
two-dimensional Γ-space is the ellipse of semiaxes $\sqrt{2m\epsilon}$ and $\sqrt{2\epsilon/m\omega^2}$
shown in Fig. 20-1b.
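
The elliptical path can be checked numerically. The following is a minimal sketch, not part of the original text: it integrates Hamilton's equations for the oscillator with a leapfrog scheme (an assumed choice of integrator) and evaluates the left side of (20-4) along the path. The values of m, ω, and the initial point are arbitrary illustrative numbers.

```python
import math

def leapfrog_path(q0, p0, m, omega, dt, steps):
    """Integrate Hamilton's equations for H = p^2/2m + m*omega^2*q^2/2.

    Uses a kick-drift-kick (leapfrog) scheme; returns the list of (q, p) points.
    """
    q, p = q0, p0
    path = [(q, p)]
    for _ in range(steps):
        p -= 0.5 * dt * m * omega**2 * q   # half step of dp/dt = -dH/dq
        q += dt * p / m                    # full step of dq/dt = +dH/dp
        p -= 0.5 * dt * m * omega**2 * q   # second half step for p
        path.append((q, p))
    return path

def ellipse_residual(q, p, m, omega, eps):
    """Left side of (20-4) minus 1; zero for a point on the energy ellipse."""
    return p**2 / (2 * m * eps) + q**2 / (2 * eps / (m * omega**2)) - 1.0

# Arbitrary illustrative values (assumptions, not from the text).
m, omega = 2.0, 3.0
q0, p0 = 0.5, 1.0
eps = p0**2 / (2 * m) + 0.5 * m * omega**2 * q0**2   # energy from (20-3)

path = leapfrog_path(q0, p0, m, omega, dt=1e-3, steps=5000)
worst = max(abs(ellipse_residual(q, p, m, omega, eps)) for q, p in path)
print("largest deviation from the ellipse (20-4):", worst)
```

The largest values of |q| and |p| reached along the path can likewise be compared with the two semiaxes quoted above.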


The physical quantities which we measure and which are of interest to 
us for a system in thermodynamic equilibrium almost always are time 
averages of suitable properties of the system which are averaged over the 
portion of the path in Γ-space corresponding to the time interval devoted
to the measurement. We know from experience that such averages are 
reproducible later provided that external conditions are unchanged—we 
keep the temperature constant, for example. It is extremely difficult to set 
up mathematical machinery for calculating these time averages for a 
general system, and, even if we could perform the calculation in principle, 
it would be essentially impossible in practice because we can never hope to 
know with any degree of certainty all the very many initial conditions of 
the motion which would be required. 

Since a direct attack on the problem of interest to us appears to be 
impossible, we can ask whether it might not be possible to devise some 
other kind of averaging process which would be equivalent, for computa- 
tional purposes, to the one which we need for comparison with experiment. 
A very great contribution to statistical mechanics was made by Gibbs, who 
suggested that we imagine a group of similar systems which are suitably 
chosen and have appropriate random properties and then that we calculate 
our averages over this whole group at a given time rather than find a time 
average for a single system. Such a group of similar systems is called an 
ensemble. It is an intellectual construction which is to simulate and to
represent, at one time, the properties of the actual system of interest as 
they are developed in the natural course of time. The properties of the 
ensemble must be so chosen as to reflect as accurately as possible whatever 
knowledge we do have of the system. 

An ensemble is imagined to be composed of very many systems which 
are all identical in type to the system of interest, that is, they are all 
described by the identical form of Hamiltonian; the members of the 
ensemble do differ, however, in their initial conditions. As a result of 
these requirements, we see that the members of an ensemble which have 
almost equal energies will be macroscopically indistinguishable. Since 
each member of an ensemble can be represented by a single point in 
Γ-space, the ensemble as a whole is represented by a collection of points in
Γ-space; these points can be assumed to have an almost continuous
distribution for an ensemble with the extremely large number of members
which we are visualizing. Since all the systems have the same Hamiltonian,
they also have the same equations of motion; thus, as time goes on, these
various points in Γ-space will all trace out their individual paths as
determined by their different initial conditions. 

Thus the basic idea of Gibbs is that of replacing the calculation of the 
time average for a single system by an ensemble average at a fixed time. 
The problem of demonstrating the equivalence of these two averages is the 
subject of ergodic theory; we shall show that it is plausible that these two 
are the same, although it has never been generally proved. In fact, we 
shall eventually adopt this equality as a basic hypothesis and trust to the 
comparison of our calculated results with experiment to justify our methods. 


20-2 Liouville’s theorem 


Let us consider those ensemble points which are initially in the small
volume element $\Delta\Omega$ in Γ-space given by


$$\Delta\Omega = \Delta p_1 \, \Delta p_2 \cdots \Delta p_{\mathcal{N}} \, \Delta q_1 \, \Delta q_2 \cdots \Delta q_{\mathcal{N}} \tag{20-5}$$


As time goes on and the various representative points contained in $\Delta\Omega$
move about, the shape of $\Delta\Omega$ can change considerably in order to continue
to surround the same points; we want to show that the volume nevertheless
remains constant.

We define a density D of points in Γ-space so that the number in $\Delta\Omega$ is
$D \, \Delta\Omega$. We shall show that D is constant, and for this purpose we need to
obtain an equation of continuity applicable to Γ-space. Let us consider a
volume $\Delta\Omega'$ which is fixed in Γ-space. In a time dt, the gain in the number
of points in $\Delta\Omega'$ can be written


$$\text{Gain} = \frac{\partial D}{\partial t} \, dt \, \Delta\Omega' \tag{20-6}$$


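We can calculate this gain in another way by considering the flow through
the surfaces bounding $\Delta\Omega'$. In Fig. 20-2 we show the projection of $\Delta\Omega'$
on the $p_1 q_1$ plane; if we construct a cylinder of length $\dot{q}_1 \, dt$ on the surface
of $\Delta\Omega'$, the “area” of the base will be $\Delta q_2 \cdots \Delta q_{\mathcal{N}} \, \Delta p_1 \cdots \Delta p_{\mathcal{N}}$, and the
“volume” will be $\dot{q}_1 \, dt \, \Delta q_2 \cdots \Delta q_{\mathcal{N}} \, \Delta p_1 \cdots \Delta p_{\mathcal{N}}$. Since all the points
contained in this cylinder will have passed through the bounding surface of
$\Delta\Omega'$, we can say that the number passing into $\Delta\Omega'$ through the surface “normal”
to $q_1$ is

$$(D \dot{q}_1 \, dt)_{q_1} \, \Delta q_2 \cdots \Delta q_{\mathcal{N}} \, \Delta p_1 \cdots \Delta p_{\mathcal{N}} \tag{20-7}$$


while the number of points which have passed out of $\Delta\Omega'$ through the
surface normal to $q_1$ is

$$(D \dot{q}_1 \, dt)_{q_1 + \Delta q_1} \, \Delta q_2 \cdots \Delta q_{\mathcal{N}} \, \Delta p_1 \cdots \Delta p_{\mathcal{N}} \tag{20-8}$$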


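Therefore the net gain due to flow through the faces of $\Delta\Omega'$ normal to $q_1$ is
(20-7) minus (20-8), which becomes


$$\text{Gain through } q_1 = -\frac{\partial}{\partial q_1} (D \dot{q}_1) \, dt \, \Delta\Omega' \tag{20-9}$$


with the use of $f(q_1 + \Delta q_1) - f(q_1) = (\partial f/\partial q_1) \, \Delta q_1$ and (20-5). We can
discuss the flow of points through the surface normal to $p_1$ in the same way,
and we shall find that


$$\text{Gain through } p_1 = -\frac{\partial}{\partial p_1} (D \dot{p}_1) \, dt \, \Delta\Omega' \tag{20-10}$$


There will be similar expressions for all the rest of the p’s and q’s; adding
all of them, we obtain the following expression for the gain of points in
$\Delta\Omega'$:


$$\text{Gain} = -\sum_i \left[ \frac{\partial}{\partial p_i} (D \dot{p}_i) + \frac{\partial}{\partial q_i} (D \dot{q}_i) \right] dt \, \Delta\Omega' \tag{20-11}$$


If we equate (20-6) and (20-11) and cancel the common factor $dt \, \Delta\Omega'$, we
obtain the equation of continuity,


$$\frac{\partial D}{\partial t} + \sum_i \left[ \frac{\partial}{\partial p_i} (D \dot{p}_i) + \frac{\partial}{\partial q_i} (D \dot{q}_i) \right] = 0 \tag{20-12}$$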


In order to prove Liouville’s theorem, we differentiate the products in
(20-12) and use the equations of motion (20-2); the general term of the sum
in (20-12) then becomes


$$\left( \frac{\partial D}{\partial p_i} \dot{p}_i + \frac{\partial D}{\partial q_i} \dot{q}_i \right) + D \left( -\frac{\partial^2 H}{\partial p_i \, \partial q_i} + \frac{\partial^2 H}{\partial q_i \, \partial p_i} \right) \tag{20-13}$$

The last term in parentheses in (20-13) vanishes because of the equality of
mixed second partial derivatives, and (20-12) finally becomes


$$\frac{\partial D}{\partial t} + \sum_i \left( \frac{\partial D}{\partial p_i} \dot{p}_i + \frac{\partial D}{\partial q_i} \dot{q}_i \right) = \frac{dD}{dt} = 0 \tag{20-14}$$

so that

$$D = \text{const.} \tag{20-15}$$


Therefore Liouville’s theorem (20-15) tells us that, if we follow a representative
point along its path in Γ-space, the density of points in its immediate
neighborhood will remain constant.
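
As an illustrative check, not taken from the text, one can follow a small set of phase points for the one-dimensional harmonic oscillator and verify that the area they enclose in the q-p plane stays fixed while its shape changes. The sketch below assumes the exact oscillator solution and arbitrary values of m and ω; because this particular flow is linear, a small rectangle is carried into a parallelogram, so the shoelace formula applied to the four corners gives the enclosed area exactly.

```python
import math

def flow(q0, p0, m, omega, t):
    """Exact solution of Hamilton's equations for the harmonic oscillator."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    q = q0 * c + (p0 / (m * omega)) * s
    p = p0 * c - m * omega * q0 * s
    return q, p

def polygon_area(points):
    """Shoelace formula for the area enclosed by a polygon in the q-p plane."""
    total = 0.0
    for (q1, p1), (q2, p2) in zip(points, points[1:] + points[:1]):
        total += q1 * p2 - q2 * p1
    return 0.5 * abs(total)

# Arbitrary illustrative values (assumptions, not from the text).
m, omega = 2.0, 3.0
corners = [(1.0, 0.5), (1.1, 0.5), (1.1, 0.7), (1.0, 0.7)]

area_initial = polygon_area(corners)
areas_later = [
    polygon_area([flow(q, p, m, omega, t) for q, p in corners])
    for t in (0.3, 1.0, 7.5)
]
print(area_initial, areas_later)
```

For a nonlinear Hamiltonian the boundary would no longer stay polygonal, and many more boundary points would be needed to estimate the area.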

There is another way of expressing Liouville’s theorem. Suppose we
consider the volume $\Delta\Omega$ which moves along with one of the points as
shown in Fig. 20-3. The boundaries of the two regions are defined by the
same set of points whose paths are determined by Hamilton’s equations of
motion. Then it is easy to show that Liouville’s theorem (20-15) leads to


$$\Delta\Omega = \text{const.} \tag{20-16}$$


The number of points contained in $\Delta\Omega$ is $D \, \Delta\Omega$ by the definition of D;
however,

$$D \, \Delta\Omega = \text{const.} \tag{20-17}$$


because the boundaries of the volume elements were defined by the same set
of phase points so that the two volumes of Fig. 20-3 always contain the same
points by their construction. If we combine (20-15) and (20-17), we
immediately obtain (20-16) as a direct consequence of Liouville’s theorem.


Fig. 20-3


20-3 The basic hypotheses of statistical mechanics 


It is natural for us to interpret the density D as being proportional to the
probability of finding a representative point in a given volume element of
Γ-space. If we do this, Liouville’s theorem also tells us: If the assumption
that equal volume elements, $\Delta\Omega_1$ and $\Delta\Omega_2$, enclose regions of equal
probability is correct at a given time, then it is correct at all times. The
reason is that, if the densities associated with the volume elements are $D_1$
and $D_2$, then, if $D_1 = D_2$ at a given time, we shall have $D_1 = D_2$ for all
times because of (20-15), and consequently the probabilities associated
with $\Delta\Omega_1$ and $\Delta\Omega_2$ will remain equal if they ever were.

As a result, it is plausible for us to make the hypothesis that equal
volume elements in Γ-space are associated with equal probabilities, or, in
other words:


The probability of finding a system in a given range of states
(that is, of finding a representative point of the ensemble in a
given volume element) is proportional to the size of the
corresponding volume element in Γ-space. (20-18)


This is the basic reason for our use of Γ-space because it is only in this
space that Liouville’s theorem holds. It is important to emphasize that 
Liouville’s theorem does not demand this basic assumption (20-18), but 
only shows that the laws of Hamiltonian mechanics are compatible with it. 
We now want to show in a qualitative way that Liouville’s theorem 
combined with (20-18) makes plausible our second basic hypothesis: 


The ensemble average at a given time is equivalent to the 
time average for a single system when used as the theoretical 
quantity to be compared with the corresponding experimental
value of a given macroscopic property of the system. (20-19)


Let us consider two systems A and B traveling along the same trajectory in
Γ-space. Thus the constants of motion are the same for the two, but they
are displaced in time so that, for example, if $p_{iA} = f(t)$ then $p_{iB} = f(t - \tau)$;
hence the point B occupies the same points in Γ-space as does A, but
always a time $\tau$ later; this general situation is illustrated in Fig. 20-4.
At the time $t_0$ they are in the positions shown; B arrives at A’s original
position at the time $t_0 + \tau$, hence B spent the time $\tau$ in $\Delta\Omega_0$. Similarly, B
will also spend the time $\tau$ in the volume $\Delta\Omega_1$ shown farther along the
path at $t_1$. But, from (20-16), $\Delta\Omega_1 = \Delta\Omega_0$, so that B spends equal times in
equal volumes.

Fig. 20-4 (labels: $\Delta\Omega_0$ at $t_0$, $\Delta\Omega_1$)

Since the time spent by a system in a given volume element in Γ-space is
proportional to the size of the volume element, we see from (20-18) that
the time spent is also proportional to the probability of finding a member of
the ensemble in the volume element. Thus, if we were to calculate the
time average of a given quantity by a formula of the form (16-9), we could
replace the original integral over the time by an equal integral over the
volume $\Omega$ of Γ-space and hence by an ensemble average. In this way we
see that our hypothesis (20-19) about the equivalence of the time and
ensemble averages is consistent with our other basic hypothesis (20-18)
about probability being proportional to volume in Γ-space.
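
The equivalence can be illustrated for the one-dimensional harmonic oscillator, for which both averages are easy to compute. The sketch below is an assumed example, not part of the text: it compares the time average of q² over one period for a single system with the average of q² at one instant over an ensemble spread uniformly over a thin energy shell. It relies on the fact that for this system the area element of the q-p plane is proportional to d(energy) d(phase), so a uniform grid in energy and phase has uniform density in Γ-space. All parameter values are arbitrary.

```python
import math

# Arbitrary illustrative values (assumptions, not from the text).
m, omega, U, dU = 1.0, 2.0, 3.0, 0.01

def q_of(energy, phase):
    """Coordinate on the energy ellipse (20-4) at a given phase angle."""
    return math.sqrt(2 * energy / (m * omega**2)) * math.sin(phase)

# Time average of q^2 over one period for a single system of energy U.
n_t = 10000
period = 2 * math.pi / omega
time_avg = sum(
    q_of(U, omega * (k + 0.5) * period / n_t) ** 2 for k in range(n_t)
) / n_t

# Ensemble average at one instant: members spread uniformly over the shell
# between the ellipses of energy U - dU and U + dU.
n_e, n_ph = 50, 2000
total = 0.0
for i in range(n_e):
    energy = U - dU + 2 * dU * (i + 0.5) / n_e
    for j in range(n_ph):
        total += q_of(energy, 2 * math.pi * (j + 0.5) / n_ph) ** 2
ensemble_avg = total / (n_e * n_ph)

print(time_avg, ensemble_avg, U / (m * omega**2))
```

Both numbers should agree with U/mω², the value expected analytically for this system.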


20-4 The microcanonical ensemble 


We recall that a very important concept in thermodynamics was that of 
an isolated system—one which did not interact with its surroundings and 
hence was characterized by constant energy and volume; that is, 


$$U = \text{const.}, \qquad V = \text{const.} \tag{20-20}$$


We want to devise an ensemble which is to represent an isolated system; 
such an ensemble is called a microcanonical ensemble. 

Since the energy is constant, the $2\mathcal{N}$ coordinates of a representative
point are subject to the equation of constraint,


$$H(p_1, \ldots, p_{\mathcal{N}}, q_1, \ldots, q_{\mathcal{N}}) = U = \text{const.} \tag{20-21}$$


In other words, the representative point is always on the surface in Γ-space
given by (20-21). (For the example of a one-dimensional oscillator, the
corresponding surface is the ellipse of Fig. 20-1b.) Therefore, when we
visualize our microcanonical ensemble, the representative points can be 
distributed only on this surface and nowhere else; the density D must 
correspondingly be zero everywhere except on this surface. 

Such a representation as this is both undesirable and unrealistic. It is
undesirable in that we have been considering only volume elements in
Γ-space up to now and the use of this surface would make all the volumes
zero. It is unrealistic in that experimentally we cannot know the value of
the energy of our isolated system with complete precision; a more
accurate appraisal of our knowledge would be that we know the energy to
have the value U within limits $\delta U$; that is, as far as we know the energy is
somewhere in the range


$$U - \delta U < U < U + \delta U \tag{20-22}$$


Therefore the points of our ensemble should be distributed so that they
occupy the shell in Γ-space between the two surfaces defined by
$H = U \pm \delta U$. Since as far as we know all values of the energy in this range are
equally likely, the density D should be constant in this shell and zero
outside, as we know enough to be able to say confidently that the energy of
the system does not have a value corresponding to a point in Γ-space
outside the shell.

Therefore our microcanonical ensemble is defined by giving the value of
D for each point P of Γ-space as


$$D(P) = \begin{cases} \text{const.}, & U - \delta U < U(P) < U + \delta U \\ 0, & \text{otherwise} \end{cases} \tag{20-23}$$

where U(P) is evaluated by means of (20-21). In this way we have obtained
a statistical mechanical representation of an isolated system.

Rather than continuing at present with a discussion of these very 
general systems, we shall now turn to a specialized type of system which is 
of great historical and practical importance. 


Exercise 


20-1. Assume the system to be a single particle of mass m moving vertically 
in a constant gravitational field. Draw a figure showing the surfaces in Γ-space
which bound the representative points of the microcanonical ensemble. Choose
a small area between these surfaces, and show that points within this area at
t = 0 are in a corresponding area at a later time $\tau$. Show directly that these two
areas are equal. 


21 Systems of independent subsystems 


We shall now restrict our considerations to systems which can be regarded 
as being composed of N independent identical subsystems, that is, where 
the interactions among these subsystems can be neglected. An example 
is an ideal gas consisting of N molecules whose mutual forces can be 
neglected. As a matter of fact, most of the applications we shall make of 
our results obtained in this chapter will be to systems of particles, such as 
gases, so that the subsystems will be the particles; accordingly, we shall 
frequently refer to our subsystems as particles for simplicity and definite- 
ness. However, it is important to keep in mind that our results are more 
generally applicable.

If the subsystem has f degrees of freedom, the total number of degrees of
freedom is given by (20-1). If the Hamiltonian of the kth subsystem is
$H_k(p_1^k, \ldots, q_f^k)$, the Hamiltonian of the system is

$$H = \sum_{k=1}^{N} H_k(p_1^k, \ldots, q_f^k) \tag{21-1}$$


where all terms in the sum are identical in form. There are no terms in 
(21-1) which involve the coordinates of two or more subsystems because of 
our assumption that they are independent. 

Since the subsystems move independently, Γ-space can be divided into
separate portions which are identical in nature and each of which is
associated with one subsystem. In other words, we can imagine Γ-space
as “factored” into a separate phase space for each subsystem. This
separate phase space is called μ-space; it has only 2f dimensions (six for a
monatomic gas, for example). The coordinates of a point in μ-space are
$p_1, p_2, \ldots, q_f$; the instantaneous state of a subsystem is given by the 2f
values of these quantities, that is, by a point in μ-space. Since this can be
done for every subsystem, we see that the instantaneous state of the
system as a whole can be represented in two equivalent ways: by one
point in Γ-space or by N points in μ-space. Similarly, an ensemble with
$\mathcal{M}$ members can be represented by $\mathcal{M}$ points in Γ-space or by $N\mathcal{M}$ points
in μ-space.

If we imagine μ-space divided into volumes (“cells”) of equal size
$\Delta\Omega_\mu$, where


$$\Delta\Omega_\mu = \Delta p_1 \cdots \Delta p_f \, \Delta q_1 \cdots \Delta q_f \tag{21-2}$$

the corresponding volume in Γ-space, $\Delta\Omega$, is given by

$$\Delta\Omega = (\Delta\Omega_\mu)^N \tag{21-3}$$

Since we have associated equal probability with equal volumes $\Delta\Omega$ by
(20-18), we can conclude from (21-3) that we can also assume equal
probabilities to be associated with volumes of equal size in μ-space, that is,
with the $\Delta\Omega_\mu$.


21-1 Probability of a state 


We assume that our system occupies a definite volume V and has a 
constant energy U; hence it is an isolated system and is described by the
microcanonical ensemble of (20-23). Let us now divide all of μ-space into
a total of M cells of equal volume $\Delta\Omega_\mu$; we label the cells by an index
$i = 1, 2, \ldots, M$.

A given state of the system can be described by giving the number of
points (subsystems) in each of these cells, that is, by the set of numbers
$n_1, n_2, \ldots, n_i, \ldots, n_M$, where $n_i$ is the number in the ith cell and where
$\sum_i n_i = N$. This set of numbers $n_i$ is called a macrostate or a distribution.
The individual coordinates are not specified for each particle; hence a 
macrostate represents a particular macroscopic state of the system in the 
same sense in which we used the distribution function in kinetic theory, 
as discussed in connection with Fig. 16-2. 

Since the particles are of the same type, we can interchange them at will;
if we do this without changing the set $n_i$, we shall obtain the same macrostate
and hence the same macroscopic physical state. Every such arrangement
of particles among and within the cells is called a microstate or
complexion; thus in a microstate we visualize a definite specification of the
particular cell in which each particle is located in μ-space as contrasted to
the mere counting of the number in a cell which is required to find a
macrostate. Accordingly, we see that there are generally many microstates
which correspond to the same macrostate. However, each microstate
corresponds to a different volume element in Γ-space although these
volumes are all the same size, as given by (21-3). This statement is illustrated
in Fig. 21-1, which shows the projection of Γ-space on the $q_1^1 q_1^2$
plane; the corresponding coordinates of two microstates I and II which
differ only in the interchange of particles 1 and 2 are also shown. The figure
shows that these two microstates correspond to different regions of
Γ-space, although each has the same number of particles in the appropriate
cells in μ-space.


Fig. 21-1

According to our basic hypothesis (20-18), the probability of a macrostate
is proportional to the corresponding volume in Γ-space; combining
this statement with the preceding discussion, we have:


Relative probability of a macrostate $(n_1, n_2, \ldots, n_i, \ldots)$
= total corresponding volume in Γ-space
= (number of microstates) times (volume per microstate)
= (number of possible combinations of N things taken
$n_1, n_2, \ldots, n_i, \ldots$ at a time) times ($\Delta\Omega$)

$$= \frac{N!}{n_1! \, n_2! \cdots n_i! \cdots n_M!} \, \Delta\Omega$$

so that

$$W_B = \frac{N!}{n_1! \, n_2! \cdots n_i! \cdots n_M!} = \frac{N!}{\prod_i n_i!} \tag{21-4}$$


is proportional to the probability of a macrostate. The result (21-4) is the
one originally obtained and used by Boltzmann. For identical subsystems,
it is necessary to divide it by N! and take

$$W = \frac{1}{n_1! \cdots n_i! \cdots n_M!} \tag{21-5}$$

as the thermodynamic probability; from now on we generally need use only
W. The division by N! is called “corrected Boltzmann counting”; there
are many good reasons for this seemingly arbitrary step, and we shall
discuss them as they subsequently arise.

As a simple illustration of (21-5), suppose that N = 2 and M = 2. 
The possible macrostates with their corresponding probabilities are 


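$$\begin{array}{lll}
n_1 = 1, & n_2 = 1, & W = \dfrac{1}{1! \, 1!} = 1 \\[2ex]
n_1 = 2, & n_2 = 0, & W = \dfrac{1}{2! \, 0!} = \dfrac{1}{2} \\[2ex]
n_1 = 0, & n_2 = 2, & W = \dfrac{1}{0! \, 2!} = \dfrac{1}{2}
\end{array}$$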


This example also shows why W is called a relative probability, for it is
not normalized and does not satisfy (16-10). The normalized probability w
is obtained by dividing W by its sum over all possible macrostates so that


$$w = \frac{W}{\sum\limits_{\text{macrostates } (n_i)} W} \tag{21-6}$$


if we use $(n_i)$ as a symbol for a given macrostate. w as defined in this way
satisfies (16-10).
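
The counting behind (21-4) can be verified by brute force for small N and M. The sketch below is an illustration, not the author's procedure: it enumerates every microstate (an explicit cell label for each of N distinguishable particles), groups the microstates by macrostate, and compares the counts with the formula. The function names and the choice N = 4, M = 3 are assumptions made for the example.

```python
from collections import Counter
from itertools import product
from math import factorial

def macrostate_counts(N, M):
    """Enumerate every microstate (a cell label for each distinguishable
    particle) and count how many fall into each macrostate (n_1, ..., n_M)."""
    counts = Counter()
    for assignment in product(range(M), repeat=N):
        occupation = tuple(assignment.count(cell) for cell in range(M))
        counts[occupation] += 1
    return counts

def boltzmann_count(occupation):
    """W_B of (21-4): N! divided by the product of the n_i!."""
    denom = 1
    for n in occupation:
        denom *= factorial(n)
    return factorial(sum(occupation)) // denom

N, M = 4, 3
counts = macrostate_counts(N, M)

# Normalized probability w of (21-6). Dividing W_B by N! to obtain W of
# (21-5) rescales numerator and denominator alike, so w is unchanged.
total = sum(counts.values())
w = {occ: c / total for occ, c in counts.items()}

for occ in sorted(counts):
    print(occ, counts[occ], boltzmann_count(occ), round(w[occ], 4))
```

Calling `macrostate_counts(2, 2)` reproduces the N = 2, M = 2 illustration above once each count is divided by N! = 2.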

The distribution $(n_i)$ is not completely arbitrary but must satisfy the
conditions that the number of particles and the energy remain constant;
that is,

$$\sum_i n_i = N = \text{const.} \tag{21-7}$$


$$\sum_i n_i \epsilon_i = U = \text{const.} \tag{21-8}$$


In (21-8), $\epsilon_i$ is the energy corresponding to a particle in the ith cell with
coordinates $p_1, p_2, \ldots, q_f$ and is found by evaluating the particle
Hamiltonian [any of the $H_k$ of (21-1)] for these values of the p’s and q’s.

Our basic problem has now been solved in principle, for, if G is any
system property of interest to us, its average value is


$$\bar{G} = \sum_{(n_i)} G w = \frac{\sum_{(n_i)} G W}{\sum_{(n_i)} W} \tag{21-9}$$


For example, the average number of particles in the rth cell is


$$\bar{n}_r = \frac{\sum_{(n_i)} n_r W}{\sum_{(n_i)} W} \tag{21-10}$$

In both of these formulas, the sums are to be taken over all possible 
macrostates, that is, those which satisfy the conditions (21-7) and (21-8). 
These averages can be calculated by methods developed by Darwin and 
Fowler, but the mathematics involved is rather formidable. The results 
obtained in this way are the same as those obtained by Boltzmann in his 
original approach, which we shall follow. 
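
For a very small system the sums in (21-9) and (21-10) can be carried out directly by listing every macrostate allowed by (21-7) and (21-8). The following sketch does this under assumed conditions that are not from the text: four cells with integer energies 0, 1, 2, 3 in arbitrary units, N = 6, and U = 6. It uses W of (21-5) as the weight and also picks out the macrostate of largest W for comparison.

```python
from math import factorial

def macrostates(N, U, energies):
    """Yield all occupation tuples with sum n_i = N and sum n_i*eps_i = U.

    Assumes non-negative integer cell energies.
    """
    def recurse(i, n_left, u_left, prefix):
        if i == len(energies) - 1:
            # The last cell must take all remaining particles and energy.
            if n_left * energies[i] == u_left:
                yield prefix + (n_left,)
            return
        for n in range(n_left + 1):
            used = n * energies[i]
            if used > u_left:
                break
            yield from recurse(i + 1, n_left - n, u_left - used, prefix + (n,))
    yield from recurse(0, N, U, ())

def thermo_W(occupation):
    """W of (21-5): reciprocal of the product of the n_i!."""
    w = 1.0
    for n in occupation:
        w /= factorial(n)
    return w

energies = [0, 1, 2, 3]   # hypothetical cell energies, arbitrary units
N, U = 6, 6

states = list(macrostates(N, U, energies))
weights = [thermo_W(s) for s in states]
norm = sum(weights)

# Average occupation of each cell, following (21-10).
mean_n = [
    sum(w * s[r] for w, s in zip(weights, states)) / norm
    for r in range(len(energies))
]
most_probable = states[weights.index(max(weights))]

print("number of allowed macrostates:", len(states))
print("average occupations:", mean_n)
print("most probable macrostate:", most_probable)
```

With so few particles the average and the most probable occupations need not agree closely; the direct enumeration becomes impractical long before N reaches values of physical interest.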

Boltzmann did not try to evaluate average values but instead found 
which macrostate had maximum probability and regarded this state as the 
equilibrium state. The calculated properties of the maximum probability 
macrostate were then to be compared with experiment. This method is 
based on the very plausible argument that, since the equilibrium state is 
what we are virtually certain of observing, its probability of occurrence 
must be overwhelmingly large and therefore a maximum. However, this 
method is different from our original aim of finding averages, and we have 
already seen an example of the fact that the most probable value is generally 
different from the average value in the specific case (18-19). Nevertheless, 
we shall proceed to calculate most probable values, and later we shall
show by various methods that the difference between the most probable
and the average values of a quantity is completely insignificant for
systems of interest in statistical mechanics; the basic reason for this is the
very large number of particles involved.

We can write W in a more convenient form, however, and, in fact, it is
more useful to consider ln W; from (21-5), we find


$$\ln W = -\sum_i \ln n_i! \tag{21-11}$$

It is customary to use the Stirling approximation for the factorial:

$$n! \simeq \left( \frac{n}{e} \right)^n \tag{21-12}$$

so that

$$\ln n! \simeq n \ln \frac{n}{e} = n \ln n - n \tag{21-13}$$


Substituting (21-13) into (21-11) and using (21-7), we find that our approximation
is

$$\ln W = -\sum_i n_i \ln n_i + N \tag{21-14}$$

21-2 Lagrange’s method of multipliers 


In order to maximize ln W, we cannot simply differentiate (21-14) with
respect to each of the $n_i$ in turn and set each derivative equal to zero
because the $n_i$ are not independent. The $n_i$ are subject to the auxiliary
conditions (21-7) and (21-8) and cannot be arbitrarily varied. A way of
solving such a problem was devised by Lagrange and is known as the
method of (undetermined) multipliers.

Suppose we wish to find the values of the variables $x_1, x_2, \ldots, x_K$ which
correspond to an extreme value of the function $\psi(x_1, x_2, \ldots, x_K)$. We also
assume that the variables are not all independent but are subject to L
equations of constraint of the general form


$$\phi_j(x_1, x_2, \ldots, x_K) = \text{const.} \qquad (j = 1, 2, \ldots, L) \tag{21-15}$$


where L < K; the number of independent variables is therefore K − L.
In principle, we could solve each of the equations (21-15) for a given
variable and substitute the result into $\psi$ so that $\psi$ would finally be expressed
as a function of independent variables only. This procedure is not
always practicable, however, and we can proceed differently.

If we imagine virtual variations $\delta x_i$ in the variables from their values
that correspond to the extreme value of $\psi$, the condition for the extreme
value is that the corresponding first order variation $\delta\psi$ vanish so that


$$\delta\psi = \sum_{i=1}^{K} \frac{\partial\psi}{\partial x_i} \, \delta x_i = 0 \tag{21-16}$$

We cannot set the coefficient of each $\delta x_i$ separately equal to zero because
the $x_i$ are not independent. Similarly, we can differentiate the constraint
equations (21-15) and obtain the L equations

$$\delta\phi_j = \sum_{i=1}^{K} \frac{\partial\phi_j}{\partial x_i} \, \delta x_i = 0 \tag{21-17}$$


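Let us multiply each of the equations (21-17) by an arbitrary parameter $\lambda_j$,
sum the results, and obtain


$$\sum_{j=1}^{L} \lambda_j \, \delta\phi_j = \sum_{j=1}^{L} \sum_{i=1}^{K} \lambda_j \frac{\partial\phi_j}{\partial x_i} \, \delta x_i = 0 \tag{21-18}$$

If we add (21-18) to (21-16), we find that

$$\delta\psi + \sum_{j=1}^{L} \lambda_j \, \delta\phi_j = \sum_{i=1}^{K} \left( \frac{\partial\psi}{\partial x_i} + \sum_{j=1}^{L} \lambda_j \frac{\partial\phi_j}{\partial x_i} \right) \delta x_i = 0 \tag{21-19}$$


Because the $\lambda_j$ are arbitrary, we can choose them to have any value which
suits our purposes; therefore we shall require that the $\lambda_j$ be chosen so
that the coefficients of the first L of the $\delta x_i$ in (21-19) are zero; that is,


$$\frac{\partial\psi}{\partial x_i} + \sum_{j=1}^{L} \lambda_j \frac{\partial\phi_j}{\partial x_i} = 0 \qquad (i = 1, 2, \ldots, L) \tag{21-20}$$


These are L equations which can be solved for the L unknown $\lambda_j$; hence
they can now be considered as known functions of the $x_i$, that is,


$$\lambda_j = \lambda_j(x_1, x_2, \ldots, x_K) \qquad (j = 1, 2, \ldots, L) \tag{21-21}$$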

Equation (21-19) now becomes

$$\sum_{i=L+1}^{K} \left( \frac{\partial\psi}{\partial x_i} + \sum_{j=1}^{L} \lambda_j \frac{\partial\phi_j}{\partial x_i} \right) \delta x_i = 0 \tag{21-22}$$


Since we have so far not decided which of the $x_i$ we are going to consider
to be the independent variables and which the dependent ones, we can
now choose the first L of the $x_i$ to be the dependent variables and the
remaining K − L of the $x_i$ to be independent. We can solve the L equations
(21-21) for the first L of the $x_i$ to obtain


$$x_i = x_i(x_{L+1}, \ldots, x_K, \lambda_1, \ldots, \lambda_L) \qquad (i = 1, 2, \ldots, L) \tag{21-23}$$

If these equations are now substituted into (21-22), the resultant equation
will involve only the independent variables $x_{L+1}, \ldots, x_K$. Therefore the
variations $\delta x_i$ in (21-22) are independently arbitrary, and the only way in
which (21-22) can always be zero is for the coefficients of the $\delta x_i$ to be zero
separately; in this way we obtain the K − L equations


$$\frac{\partial\psi}{\partial x_i} + \sum_{j=1}^{L} \lambda_j \frac{\partial\phi_j}{\partial x_i} = 0 \qquad (i = L + 1, \ldots, K) \tag{21-24}$$

These resultant equations when combined with (21-20) as previously
chosen give us a total of K equations as the conditions for the extreme
value of $\psi$ subject to the conditions (21-15).

All these equations are seen to be of exactly the same form and, according
to (21-19), are exactly what we would have obtained if we had used


$$\delta\psi + \sum_j \lambda_j \, \delta\phi_j = 0 \tag{21-25}$$


and treated all the $x_i$ as if they were independent. In other words, if we
look for the extreme value of the function $\psi + \sum_j \lambda_j \phi_j$ by treating the
variables as if they were all independent, the results are the same as those
which will give the extreme value of $\psi$ alone subject to the constraints $\phi_j$.
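
As a small worked illustration, which is not from the text, consider K = 2 variables and L = 1 constraint: we seek the extreme value of ψ = x₁x₂ subject to φ = x₁ + x₂ = c, where c is an assumed constant. Treating both variables as independent in (21-25) gives:

```latex
\begin{align*}
\delta(\psi + \lambda\phi) &= (x_2 + \lambda)\,\delta x_1 + (x_1 + \lambda)\,\delta x_2 = 0 \\
x_2 + \lambda &= 0, \qquad x_1 + \lambda = 0
  \quad\Rightarrow\quad x_1 = x_2 = -\lambda \\
x_1 + x_2 &= c
  \quad\Rightarrow\quad \lambda = -\tfrac{c}{2}, \qquad
  x_1 = x_2 = \tfrac{c}{2}, \qquad \psi = \tfrac{c^2}{4}
\end{align*}
```

Direct elimination gives the same answer: substituting x₂ = c − x₁ yields ψ = x₁(c − x₁), whose derivative vanishes at x₁ = c/2.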


21-3 State of maximum probability


In order to maximize ln W, we consider virtual variations $\delta n_i$ in the
number of particles in the cells. The variations of the conditions (21-7) and
(21-8) yield

$$\sum_i \delta n_i = 0 \tag{21-26}$$

$$\sum_i \epsilon_i \, \delta n_i = 0 \tag{21-27}$$

If we differentiate (21-14) and use (21-26), we find that

$$\delta \ln W = 0 = -\sum_i \left( \delta n_i \ln n_i + n_i \frac{\delta n_i}{n_i} \right) = -\sum_i \delta n_i \ln n_i \tag{21-28}$$

Following (21-25), we multiply (21-26) and (21-27) by $\alpha$ and $\beta$, respectively,
and subtract the results from (21-28) to find that the condition for the
maximum is

$$\sum_i (\ln n_i + \alpha + \beta\epsilon_i) \, \delta n_i = 0$$

Since the $\delta n_i$ can now be treated as independent, we equate the coefficients
separately to zero so that

$$\ln n_i + \alpha + \beta\epsilon_i = 0 \tag{21-29}$$

and therefore the set

$$n_i = e^{-\alpha - \beta\epsilon_i} \tag{21-30}$$


is the macrostate of maximum probability $W_m$.
If we substitute (21-29) into (21-14) and use (21-7) and (21-8), we see
that

$$\ln W_m = \alpha N + \beta U + N \tag{21-31}$$

The multipliers $\alpha$ and $\beta$ can be determined from the constraints. If we
substitute (21-30) into (21-7), we find


$$e^{-\alpha} = \frac{N}{\sum_i e^{-\beta\epsilon_i}} \tag{21-32}$$

so that (21-30) can also be written

$$n_i = \frac{N e^{-\beta\epsilon_i}}{\sum_i e^{-\beta\epsilon_i}} \tag{21-33}$$

When we substitute (21-33) into (21-8), we obtain

$$U = \frac{N \sum_i \epsilon_i e^{-\beta\epsilon_i}}{\sum_i e^{-\beta\epsilon_i}} \tag{21-34}$$


Thus, in principle, $\beta$ can be found from this result as soon as the dependence
of $\epsilon_i$ on the location of the cell is known; we shall soon find an
easier and better way of determining $\beta$.

It is convenient to define the partition function $Z_0$ by


$$Z_0 = \sum_i e^{-\beta\epsilon_i} \tag{21-35}$$

We see from (21-35) and (21-34) that

$$\frac{\partial \ln Z_0}{\partial\beta} = -\frac{\sum_i \epsilon_i e^{-\beta\epsilon_i}}{Z_0} = -\frac{U}{N}$$

and therefore

$$U = -N \frac{\partial \ln Z_0}{\partial\beta} \tag{21-36}$$

Similarly, if we use (21-35) and (21-33), we obtain

$$n_i = -\frac{N}{\beta} \frac{\partial \ln Z_0}{\partial\epsilon_i} \tag{21-37}$$

The fundamental results for the macrostate of maximum probability are
summarized in the last three numbered equations.
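
These relations can be exercised numerically on a toy set of cells. The sketch below is an assumed example, not part of the text: given hypothetical cell energies and chosen values of N and U, it finds β from (21-34) by bisection, builds the occupation numbers from (21-33), and checks (21-36) with a finite-difference derivative of the logarithm of the partition function.

```python
import math

def partition_function(beta, energies):
    """Partition function of (21-35): sum over cells of exp(-beta * eps_i)."""
    return sum(math.exp(-beta * e) for e in energies)

def energy_per_particle(beta, energies):
    """Right side of (21-34) divided by N."""
    z = partition_function(beta, energies)
    return sum(e * math.exp(-beta * e) for e in energies) / z

def solve_beta(u_per_particle, energies, lo=1e-6, hi=50.0, tol=1e-12):
    """Bisection on (21-34); the mean energy decreases as beta increases."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if energy_per_particle(mid, energies) > u_per_particle:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

energies = [0.0, 1.0, 2.0, 3.0, 4.0]   # hypothetical cell energies
N, U = 1000.0, 1200.0                  # assumed totals, arbitrary units

beta = solve_beta(U / N, energies)
Z = partition_function(beta, energies)
n = [N * math.exp(-beta * e) / Z for e in energies]   # (21-33)

# Finite-difference check of (21-36): U = -N d(ln Z)/d(beta).
h = 1e-6
U_from_Z = -N * (
    math.log(partition_function(beta + h, energies))
    - math.log(partition_function(beta - h, energies))
) / (2 * h)

print("beta =", beta)
print("occupations:", n)
print("U from (21-36):", U_from_Z)
```

The bracket for the bisection assumes that U/N lies between the lowest cell energy and the plain average of the cell energies, so that a positive β exists.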

Before we go on, it will be useful to see how the probability of any
other distribution compares with that of maximum probability. We let $n_i'$
be a macrostate of probability $W'$ given by (21-14) as


$$\ln W' = -\sum_i n_i' \ln n_i' + N \tag{21-38}$$

while $n_i$ corresponds to $W_m$ so that

$$\ln W_m = -\sum_i n_i \ln n_i + N$$

Subtracting the last two equations, we obtain


$$\ln \frac{W'}{W_m} = -\sum_i (n_i' \ln n_i' - n_i \ln n_i) \tag{21-39}$$


We now set

$$n_i' = n_i + \Delta n_i \tag{21-40}$$

The differences $\Delta n_i$ need not be small in absolute value, but we shall
assume that $|\Delta n_i / n_i| \ll 1$ so that the primed macrostate can be thought of
as “near” that of maximum probability. Since we must always have
$N = \sum_i n_i = \sum_i n_i'$, we find that


$$\sum_i \Delta n_i = 0 \tag{21-41}$$


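Using (21-40), and $\ln(1 + x) = x - \tfrac{1}{2}x^2 + \cdots$ for $x \ll 1$, we obtain

$$\begin{aligned}
n_i' \ln n_i' &= (n_i + \Delta n_i) \ln (n_i + \Delta n_i) \\
&= (n_i + \Delta n_i) \left\{ \ln n_i + \ln \left[ 1 + \frac{\Delta n_i}{n_i} \right] \right\} \\
&\simeq (n_i + \Delta n_i) \left[ \ln n_i + \left( \frac{\Delta n_i}{n_i} \right) - \frac{1}{2} \left( \frac{\Delta n_i}{n_i} \right)^2 \right] \\
&= n_i \ln n_i + \Delta n_i + \Delta n_i \ln n_i + \frac{(\Delta n_i)^2}{2 n_i}
\end{aligned} \tag{21-42}$$


which is correct to terms of order $(\Delta n_i / n_i)^2$. Substituting (21-42) into
(21-39), using (21-41) and the condition (21-28) that $n_i$ correspond to $W_m$,
we find that


$$\ln \frac{W'}{W_m} = -\sum_i \frac{(\Delta n_i)^2}{2 n_i}$$

and therefore

$$\frac{W'}{W_m} = e^{-\frac{1}{2} \sum_i (\Delta n_i)^2 / n_i} \tag{21-43}$$

In order to get a numerical example, let us assume a special situation in
which the fractional change in each cell is the same:


$$|\Delta n_i / n_i| = \text{const.} = |\Delta n / n| \tag{21-44}$$

If we substitute (21-44) into (21-43) and use (21-7), we obtain


$$\frac{W'}{W_m} = e^{-\frac{1}{2} N (\Delta n / n)^2} \tag{21-45}$$

Suppose we consider a kilomole of gas for which $N \simeq 10^{27}$ and also
choose $|\Delta n / n| \simeq 10^{-8}$, that is, a fractional difference between the two
states of only one part in a hundred million. Substituting these numbers
into (21-45), we obtain

$$\frac{W'}{W_m} \simeq e^{-5 \times 10^{10}}$$
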
Since this ratio is so very small, we can confidently say that the proba- 
bility of any other state is completely insignificant compared to that 
of maximum probability. In other words, all distributions which would 
have a significant physical difference for us have a negligible probability 
compared to the maximum probability distribution provided that N is 
sufficiently large. Consequently, only the values associated with the 
maximum probability state will contribute appreciably to the sum for the 
average in (21-9) and hence the average of any quantity will be negligibly 
different from its value of maximum probability. Thus we see some 
quantitative justification for Boltzmann’s method of procedure. 
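
The size of this ratio is easier to appreciate when the exponent is evaluated directly. The following is a minimal Python sketch of (21-43) and (21-45); the function names and the small four-cell distribution are illustrative assumptions of ours, not part of the derivation.

```python
import math

def log_ratio_general(n, dn):
    """ln(W'/W_m) from (21-43): -(1/2) * sum_i (dn_i)^2 / n_i."""
    return -0.5 * sum(d * d / ni for ni, d in zip(n, dn))

def log_ratio_uniform(N, frac):
    """ln(W'/W_m) from (21-45), valid when |dn_i/n_i| = frac in every cell."""
    return -0.5 * N * frac ** 2

# Kilomole example from the text: N = 1e27, |dn/n| = 1e-8.
ln_ratio = log_ratio_uniform(1e27, 1e-8)
print("ln(W'/W_m)    =", ln_ratio)
print("log10(W'/W_m) =", ln_ratio / math.log(10))

# Hypothetical four-cell distribution (illustrative only); the changes
# alternate in sign so that they sum to zero, as (21-41) requires.
n = [4.0e6, 4.0e6, 2.0e6, 2.0e6]
frac = 1e-3
dn = [frac * n[0], -frac * n[1], frac * n[2], -frac * n[3]]
print(log_ratio_general(n, dn), log_ratio_uniform(sum(n), frac))
```

The sketch works with the logarithm throughout because the ratio itself is far too small to be represented as a floating-point number.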


21-4 Entropy and probability 


The next thermodynamic parameter we wish to identify is the entropy S. 
Boltzmann made the fundamental suggestion that the entropy can be 
written as a function of the probability of the equilibrium state, that is, 


$$S = S(W_m) \tag{21-46}$$


In order to find the form of this function we use the basic properties of the 
two quantities involved. By the second law of thermodynamics, S is a 
maximum for an isolated system at equilibrium for which we assume 
maximum probability $W_m$. Entropy is an extensive quantity and therefore 
additive; if we combine two systems to form a single system, the entropy 
$S_{12}$ of the combined system will be 

$$S_{12} = S_1 + S_2 \tag{21-47}$$
The probability, on the other hand, is multiplicative and thus 
$$W_{12} = W_1 W_2 \tag{21-48}$$
Combining the last three equations, we obtain 
$$S_{12}(W_{12}) = S_{12}(W_1 W_2) = S_1(W_1) + S_2(W_2) \tag{21-49}$$


If we differentiate (21-49) with respect to $W_1$ and $W_2$, we find the two 
equations 
$$\frac{\partial S_{12}}{\partial W_1} = \frac{dS_{12}}{dW_{12}}\frac{\partial W_{12}}{\partial W_1} = W_2\frac{dS_{12}}{dW_{12}} = \frac{dS_1}{dW_1} \tag{21-50a}$$
$$\frac{\partial S_{12}}{\partial W_2} = \frac{dS_{12}}{dW_{12}}\frac{\partial W_{12}}{\partial W_2} = W_1\frac{dS_{12}}{dW_{12}} = \frac{dS_2}{dW_2} \tag{21-50b}$$
By multiplying these by $W_1$ and $W_2$, respectively, and by using (21-48) and 
the same arguments which led to (18-7) from (18-6), we obtain 

$$W_{12}\frac{dS_{12}}{dW_{12}} = W_1\frac{dS_1}{dW_1} = W_2\frac{dS_2}{dW_2}$$

which, when we drop the numerical subscripts and note (21-46), leads to 
the general result 

$$W_m\frac{dS}{dW_m} = \frac{dS}{d\ln W_m} = \text{const.} = k$$
and therefore to 
$$S = k \ln W_m \tag{21-51}$$

after we drop the constant of integration. Equation (21-51) is Boltzmann’s 
famous result. Later, we shall show that k is actually the Boltzmann 
constant defined in (17-14’). 

We can express S in terms of other variables by substituting (21-31) 
into (21-51). The multiplier $\alpha$ as evaluated from (21-32) and (21-35) is 

$$\alpha = \ln Z_0 - \ln N \tag{21-52}$$

and, if we also use (21-13), we obtain the various forms 

$$S = \beta kU + Nk\ln Z_0 + Nk - Nk\ln N \tag{21-53a}$$
$$= \beta kU + Nk\ln Z_0 + Nk\ln\frac{e}{N} \tag{21-53b}$$
$$= \beta kU + Nk\ln Z_0 - k\ln N! \tag{21-53c}$$
$$= \beta kU + k\ln\frac{Z_0^N}{N!} \tag{21-53d}$$
Of course, nothing we have done so far proves that the quantity S has 
the same properties as the thermodynamic entropy which is defined by 
(10-27). In order to do this, we must first find how the work done in a 
small change of state can be described. 

In general, the energy $\varepsilon_i$ of a subsystem in the ith cell depends not only 
on the coordinates of the cell but also on other external parameters $a_\lambda$ 
which appear in the Hamiltonian. The most common of these parameters 
are the volume and the electric and magnetic field components. Thus one 
has 

$$\varepsilon_i = \varepsilon_i(p_1, \ldots, q_f, a_1, \ldots, a_\lambda, \ldots) \tag{21-54}$$


If the parameter $a_\lambda$ is changed by the amount $da_\lambda$, the energy of a 
subsystem in the ith cell will be changed by 

$$d\varepsilon_i = \frac{\partial \varepsilon_i}{\partial a_\lambda}\,da_\lambda$$
so that, if we sum this over all the cells, we find the total change in energy 
of the system resulting from a change in the parameter to be given by 

$$\sum_i n_i\,d\varepsilon_i = \sum_i n_i \frac{\partial \varepsilon_i}{\partial a_\lambda}\,da_\lambda \tag{21-55}$$
which also equals the work done on the whole system in order to change $a_\lambda$. 
There will be expressions similar to (21-55) for each parameter that is 
altered; if we sum these, we get the total work dW done on the system 
resulting from a general change of state; therefore 

$$dW = \sum_\lambda \sum_i n_i \frac{\partial\varepsilon_i}{\partial a_\lambda}\,da_\lambda = \sum_\lambda F_\lambda\,da_\lambda \tag{21-56}$$
where 
$$F_\lambda = \sum_i n_i \frac{\partial \varepsilon_i}{\partial a_\lambda} \tag{21-57}$$
is defined as the generalized force acting on the system. If $a_\lambda = V$, for 
example, then $F_\lambda = -p$, and $dW = -p\,dV$ in agreement with (8-6). 

We can also express the generalized forces as derivatives of the partition 
function. If we use (21-35), (21-33), and (21-57), we find 


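$$\frac{\partial \ln Z_0}{\partial a_\lambda} = -\frac{\beta}{Z_0}\sum_i \frac{\partial\varepsilon_i}{\partial a_\lambda} e^{-\beta\varepsilon_i} = -\frac{\beta}{N}\sum_i n_i\frac{\partial \varepsilon_i}{\partial a_\lambda} = -\frac{\beta}{N}F_\lambda$$

and therefore 

$$F_\lambda = -\frac{N}{\beta}\,\frac{\partial \ln Z_0}{\partial a_\lambda} \tag{21-58}$$
If we now calculate dS from (21-53a) with the use of (21-35), (21-36), 
(21-58), and (21-56), we find 

$$\begin{aligned}
dS &= \beta k\,dU + kU\,d\beta + Nk\,d\ln Z_0 \\
&= \beta k\,dU + kU\,d\beta + Nk\frac{\partial \ln Z_0}{\partial \beta}\,d\beta + Nk\sum_\lambda \frac{\partial \ln Z_0}{\partial a_\lambda}\,da_\lambda \\
&= \beta k\left(dU - \sum_\lambda F_\lambda\,da_\lambda\right) \\
&= \beta k\,(dU - dW)
\end{aligned} \tag{21-59}$$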


This agrees with the defining equations (10-27) and (9-1) for the thermodynamic 
entropy provided that $\beta k = 1/T$. Therefore, if we adopt 

$$\beta = \frac{1}{kT} \tag{21-60}$$

as a universal relation, then $S = k \ln W_m$ is completely identified with the 
thermodynamic entropy and, for example, (21-53b) becomes 

$$S = \frac{U}{T} + Nk\ln Z_0 + Nk\ln\frac{e}{N} \tag{21-61}$$

(We note that we have not yet identified k.) 
We can now easily find the Helmholtz function from (11-46) and (21-61); 
the result is 

$$F = -NkT\ln Z_0 - NkT\ln\frac{e}{N} \tag{21-62}$$

We can use our thermodynamic result (11-65) to obtain a formula for the 
equation of state: 
$$p = -\left(\frac{\partial F}{\partial V}\right)_T = NkT\left(\frac{\partial \ln Z_0}{\partial V}\right)_T \tag{21-63}$$

This result agrees, of course, with the general formula (21-58) for the 
generalized forces and again demonstrates how a knowledge of the 
partition function $Z_0$ is sufficient to determine all the thermodynamic 
properties of a system composed of independent subsystems. 


Exercises 


21-1. Discuss and compare the Γ-space and μ-space for a system of N 
independent one-dimensional harmonic oscillators. 

21-2. Show that the expression (21-4) for $W_B$ is correct. 

21-3. Show that the entropy can be written 

$$S = -Nk\sum_i p_i \ln p_i + Nk\ln(e/N)$$
where $p_i$ is the probability that a subsystem will be in the ith cell of μ-space. 

21-4. If the subsystems are actually distinguishable (as was implied by our 
labeling them in our derivation of $W_B$) rather than completely identical (and 
accordingly indistinguishable in principle), show that one should use the 
uncorrected Boltzmann probability $W_B$ rather than (21-5). Also show that then 
(21-36) and (21-37) are unchanged, while the entropy and Helmholtz function 
are given by 

$$S = (U/T) + Nk\ln Z_0 \tag{21-64}$$

$$F = -NkT\ln Z_0 \tag{21-65}$$

rather than by (21-61) and (21-62). An example of distinguishable subsystems 
would be the localizable atoms of a solid which are fixed at their average 
positions. What other examples can you think of? 


22 Ideal gases 


Let us assume our system to be a gas containing N molecules. We let 
$x, y, z, p_x, p_y, p_z$ be the coordinates and momenta of the center of mass of a 
molecule; we let them be the $q_1, q_2, q_3, p_1, p_2, p_3$ and let any remaining 
degrees of freedom of the molecule be described by $q_4, \ldots, q_f, p_4, \ldots, p_f$. 
The Hamiltonian of the molecule can then be written as the sum of the 
energy associated with the motion of the center of mass and a term H’ 
which includes the rest of the energy; therefore 

$$H = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right) + \phi(x, y, z) + H'(p_4, \ldots, q_f) = \varepsilon \tag{22-1}$$

where $\phi(x, y, z)$ is the external potential energy which depends on the 
position of the center of mass. 


22-1 The Maxwell-Boltzmann distribution 


We shall use our results to answer the question: How many molecules 
have $x, y, \ldots, p_z$ in the range $dx\,dy\,dz\,dp_x\,dp_y\,dp_z$ regardless of the values 
of the rest of the coordinates and momenta? 

The number of molecules in the cell of volume 

$$d\Omega_\mu = dx\,dy\,dz\,dp_x\,dp_y\,dp_z\,dq_4 \cdots dp_f \tag{22-2}$$

in μ-space is given by (21-33). If we multiply (21-33) by $d\Omega_\mu/d\Omega_\mu = 1$, we 
can change the sum over cells in μ-space to the 2f-fold integral over all 
values of the p’s and q’s since both procedures cover the total volume in 
μ-space; if we also use (22-1) and (22-2), we find 
$$\begin{aligned}
n_i &= \frac{N e^{-\beta\varepsilon_i}}{\sum_i e^{-\beta\varepsilon_i}} = \frac{N e^{-\beta\varepsilon}\,d\Omega_\mu}{\int e^{-\beta\varepsilon}\,d\Omega_\mu} \\
&= \frac{N e^{-\beta p^2/2m - \beta\phi - \beta H'}\,dx \cdots dp_z\,dq_4 \cdots dp_f}{\int e^{-\beta p^2/2m - \beta\phi}\,dx \cdots dp_z \int e^{-\beta H'}\,dq_4 \cdots dp_f}
\end{aligned} \tag{22-3}$$

We actually want only the number with coordinates in the range 
$dx \cdots dp_z$ which we shall call $n_{cm}$. We can obtain $n_{cm}$ from (22-3) by 
integrating over all permissible values of $q_4, \ldots, p_f$; when we do this, the 
integral involving H’ will appear in both numerator and denominator and 
can be cancelled with the result that 

$$n_{cm} = \text{const.}\,e^{-\beta p^2/2m - \beta\phi}\,dx\,dy\,dz\,dp_x\,dp_y\,dp_z \tag{22-4}$$


If we express this result in terms of the molecular velocity u = p/m, the 
number $n_{cm}$ can be written in terms of the molecular distribution function 
f defined in (16-15); then (22-4) becomes 

$$f\,dx \cdots du_z = \text{const.}\,e^{-\frac{1}{2}\beta m u^2 - \beta\phi}\,dx\,dy\,dz\,du_x\,du_y\,du_z \tag{22-5}$$

This distribution function is known as the Maxwell-Boltzmann distribution. 

We can obtain the fraction of the molecules whose velocity components 
fall in the range $du_x\,du_y\,du_z$ regardless of position by summing (22-5) over 
the coordinates x, y, z; the result is 

$$F(u)\,du_x\,du_y\,du_z = \text{const.}\,e^{-\frac{1}{2}\beta m u^2}\,du_x\,du_y\,du_z \tag{22-6}$$

On comparing with (18-14), we see that (22-6) is identical with the Maxwell 
velocity distribution function provided that $\beta = 1/kT$; this agrees with 
(21-60) and proves that the k introduced in (21-51) is actually the Boltzmann 
constant R/L. Our derivation of (22-6) has been very general and 
has used only our basic ideas about equilibrium. We also see that the 
Maxwell velocity distribution is not restricted to point molecules, or to 
hard sphere molecules, but is applicable to molecules of any complexity 
as long as they can be assumed to be independent. 
We can obtain the density n by using (22-5) in (16-16); the result is 

$$n = \text{const.}\,e^{-\beta\phi}$$
and, if we let $n_0$ be the density where $\phi = 0$, we obtain 
$$n(x, y, z) = n_0 e^{-\beta\phi(x, y, z)} = n_0 e^{-\phi/kT} \tag{22-7}$$

The result tells us how the density of a gas in equilibrium varies with 
position and shows that the general tendency is for the density to be 
greater at locations where the potential energy is smaller. The factor 
$e^{-\phi/kT}$ is sometimes called the Boltzmann factor; the result (22-7) is often 
called the law of atmospheres since it would describe the density variation 
in an isothermal planetary atmosphere if $\phi$ is set equal to the gravitational 
potential energy. We shall discuss several of the many other uses of (22-7) 
as we proceed. 
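
To make the law of atmospheres concrete, the sketch below evaluates (22-7) with the gravitational potential energy taken as φ = mgz. This is only an illustration under stated assumptions: uniform gravity, an isothermal atmosphere at 273 K, and a molecular mass of 28 atomic mass units (roughly that of nitrogen) are choices of ours and are not taken from the text.

```python
import math

K_BOLTZMANN = 1.380649e-23    # J/K
ATOMIC_MASS = 1.66053907e-27  # kg

def density_ratio(height, mass, temperature, g=9.81):
    """n(z)/n_0 from (22-7) with phi = m*g*z (uniform gravity assumed)."""
    return math.exp(-mass * g * height / (K_BOLTZMANN * temperature))

def scale_height(mass, temperature, g=9.81):
    """Height kT/(m g) over which the density falls by a factor e."""
    return K_BOLTZMANN * temperature / (mass * g)

m_n2 = 28.0 * ATOMIC_MASS  # assumed nitrogen-like molecule
for z in (0.0, 1000.0, 5000.0, 10000.0):
    print(f"z = {z:7.0f} m   n/n0 = {density_ratio(z, m_n2, 273.0):.4f}")
print("scale height (m):", scale_height(m_n2, 273.0))
```

Heavier molecules or lower temperatures shorten the scale height, which is the quantitative content of the statement that the density is greater where the potential energy is smaller.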


22-2 Ideal monatomic gas 


We want to calculate all the thermodynamic properties of the gas when 
the molecules are assumed to be mass points. Each molecule thus has 
three degrees of freedom and (22-1) becomes 

$$\varepsilon = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right) + \phi \tag{22-8}$$

Although we shall neglect other external forces, we require the potential 
energy $\phi$ to describe our assumption that the gas is kept in a container 
of volume V while otherwise free to move anywhere within the container. 
Thus we let $\phi = 0$ inside the container, and $\phi \to \infty$ at the walls and 
everywhere outside the container; as a result, $e^{-\beta\phi} = 0$ for all points 
outside the container and we need consider in our calculations only those 
values of x, y, z which are inside. 

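We need to find the partition function $Z_0$. If we let $h^3$ represent the 
volume of a cell in μ-space, we can multiply the sum in (21-35) by $d\Omega_\mu/h^3$ 
and convert the sum to an integral. If we also change to spherical coordinates 
in momentum space and use (18-9), we obtain 

$$\begin{aligned}
Z_0 &= \sum_i e^{-\beta\varepsilon_i} \\
&= \frac{1}{h^3}\int \cdots \int e^{-\beta\varepsilon}\,dx\,dy\,dz\,dp_x\,dp_y\,dp_z \\
&= \frac{V}{h^3}\iiint e^{-\beta(p_x^2 + p_y^2 + p_z^2)/2m}\,dp_x\,dp_y\,dp_z \\
&= \frac{V}{h^3}\int_0^{2\pi}\int_0^{\pi}\int_0^{\infty} e^{-\beta p^2/2m}\,p^2 \sin\theta\,dp\,d\theta\,d\varphi \\
&= \frac{4\pi V}{h^3}\int_0^{\infty} e^{-\beta p^2/2m}\,p^2\,dp = \frac{V}{h^3}\left(\frac{2\pi m}{\beta}\right)^{3/2}
\end{aligned} \tag{22-9}$$
which can also be put in the form 

$$Z_0 = \frac{V}{h^3}(2\pi mkT)^{3/2} \tag{22-10}$$

with the use of (21-60). 
From (22-9) we find that 

$$\ln Z_0 = \ln V - \frac{3}{2}\ln\beta + \frac{3}{2}\ln\frac{2\pi m}{h^2} \tag{22-11a}$$
$$= \ln V + \frac{3}{2}\ln T + \frac{3}{2}\ln\frac{2\pi mk}{h^2} \tag{22-11b}$$
If we apply (21-36) to (22-11a), we obtain 
$$U = \frac{3N}{2\beta} = \frac{3}{2}NkT \tag{22-12}$$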


which is equivalent to the result (17-16) we previously obtained for an 
ideal monatomic gas by kinetic theory methods and again identifies k as 
the Boltzmann constant. 
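
As a numerical cross-check of (22-12), the derivative in (21-36) can be taken by finite differences rather than analytically. The sketch below is a minimal illustration in Python, assuming the closed form (22-9) for the partition function; the function names, the relative step size, and the helium-like mass are illustrative choices of ours. Applying the same procedure to (21-63) returns the ideal-gas pressure NkT/V.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K
PLANCK = 6.62607015e-34     # J s (modern value, assumed here)

def ln_z0(beta, volume, mass):
    """ln Z_0 for a monatomic ideal gas, from the closed form in (22-9)."""
    return math.log(volume) + 1.5 * math.log(2.0 * math.pi * mass / (beta * PLANCK ** 2))

def internal_energy(n_particles, beta, volume, mass, rel_step=1e-6):
    """U = -N d(ln Z_0)/d(beta), equation (21-36), by central difference."""
    h = beta * rel_step
    slope = (ln_z0(beta + h, volume, mass) - ln_z0(beta - h, volume, mass)) / (2.0 * h)
    return -n_particles * slope

def pressure(n_particles, temperature, volume, mass, rel_step=1e-6):
    """p = N k T d(ln Z_0)/dV at constant T, equation (21-63)."""
    beta = 1.0 / (K_BOLTZMANN * temperature)
    h = volume * rel_step
    slope = (ln_z0(beta, volume + h, mass) - ln_z0(beta, volume - h, mass)) / (2.0 * h)
    return n_particles * K_BOLTZMANN * temperature * slope

# Illustrative values: 6.022e23 helium-like atoms in 0.0224 m^3 at 273.15 K.
N, T, V, m = 6.02214076e23, 273.15, 0.0224, 6.6465e-27
beta = 1.0 / (K_BOLTZMANN * T)
print(internal_energy(N, beta, V, m), 1.5 * N * K_BOLTZMANN * T)
print(pressure(N, T, V, m), N * K_BOLTZMANN * T / V)
```

Each printed pair should agree to within the accuracy of the finite-difference step, and neither result depends on the value chosen for the cell size h.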

The entropy is obtained by inserting (22-11b) and (22-12) into (21-61) 
and is 

$$S = \frac{3}{2}Nk\ln T + Nk\ln V + Nk\ln\left[\frac{(2\pi mk)^{3/2}e^{5/2}}{Nh^3}\right] \tag{22-13}$$
Since $Nk = (\nu L)(R/L) = \nu R$, we can also write (22-13) as 
$$S = C_v \ln T + \nu R\ln V + \text{const.} \tag{22-14}$$


if we use (17-17). We see that (22-14) is exactly the same as the expression 
(9-43) we first found for the entropy of an ideal gas by purely thermo- 
dynamic means. 

The equation of state as found from (21-63) and (22-11) is 


$$p = \frac{NkT}{V} \tag{22-15}$$
and is the equation of state of an ideal gas (17-15) which we have succeeded 
in obtaining by the general methods of statistical mechanics. If we 
review our calculation of $Z_0$, we see that the volume V will enter into $Z_0$ in 
the same way as in (22-10), no matter how complicated the molecules 
might be, as long as they can be treated as independent. As a result, we 
shall always obtain (22-15); thus we also find that the same equation of 
state applies to all ideal gases and is not restricted to those which are 
monatomic. 


22-3 The absolute value of the entropy 


If we rewrite (22-13) so that it involves only one logarithm, we obtain 
the Sackur-Tetrode formula 
$$S = Nk\ln\left[\left(\frac{V}{N}\right)\frac{(2\pi mkT)^{3/2}e^{5/2}}{h^3}\right] \tag{22-16}$$
Since the ratio (V/N) is intensive, $S \propto N$ and therefore our entropy 
expression is extensive, as it should be. 

The extensive property of S is a consequence of our use of corrected 
Boltzmann counting, that is, (21-5) rather than (21-4). We see from (21-4) 
and (21-51) that the use of uncorrected Boltzmann counting is equivalent 
to adding 

$$k\ln N! = Nk\ln\frac{N}{e}$$

to (22-16), so that the entropy formula would be 

$$S = Nk\ln\left[\frac{V(2\pi mkT)^{3/2}e^{3/2}}{h^3}\right]$$

This result is not extensive because only V appears under the logarithm. 
Thus we have found one of the reasons for the use of corrected Boltzmann 
counting for identical subsystems; we shall defer a more complete 
discussion until later. 

We have also obtained the absolute value of the entropy; we could not 
do this in thermodynamics. If we apply (22-13) to a kilomole and use 
(17-17) again, we find that 

$$s = c_v \ln T + R\ln v + s_0$$

which is exactly the same as (15-16) except that now we have found that 
$$s_0 = R\ln\left[\frac{(2\pi mk)^{3/2}e^{5/2}}{Lh^3}\right] \tag{22-17}$$

If we combine (15-18) and (22-17), we find the entropy constant $s_{0p}$, which 
appears in the vapor pressure equation (15-23), to be given by 

$$s_{0p} = R\ln\left[\frac{(2\pi m)^{3/2}(ek)^{5/2}}{h^3}\right] \tag{22-18}$$

These results show that the entropy involves the size $h^3$ of the cells in 
μ-space which we have not yet specified. This fact did not disturb Boltzmann 
because the entropy as originally defined always involved an additive 
constant and he was free to choose his cell size in any arbitrary way. The 
third law of thermodynamics, however, requires that the entropy have a 
definite value, and this means that the cell size $h^3$ can no longer be chosen 
arbitrarily but must be fixed to agree with experimental results for the 
entropy. For example, from a study of the vapor pressure curve, one can 
obtain an experimental value for $s_{0p}$ and then find h from the theoretical 
expression (22-18). It is found that, within experimental error, h is equal to 
Planck’s constant ($6.625 \times 10^{-34}$ joule-second) which was introduced by 
Planck at the beginning of quantum theory. 

Thus there is a natural size for the cells in μ-space; for the monatomic 
ideal gas it equals $h^3$. This is in agreement with the general results of 
quantum theory which show that we cannot know both a momentum and 
its conjugate coordinate exactly, but only to within the “uncertainties” 
$\Delta p_i$ and $\Delta q_i$ which are limited according to the uncertainty principle by 

$$\Delta p_i\,\Delta q_i \approx h \tag{22-19}$$

Accordingly, for a monatomic gas we would have 
$$\Delta\Omega_\mu = \Delta p_x\,\Delta x\,\Delta p_y\,\Delta y\,\Delta p_z\,\Delta z \approx h^3$$

which agrees with what we found above. In general, then, if a system has f 
degrees of freedom, the cell size in μ-space must be chosen to be 

$$\Delta\Omega_\mu = h^f \tag{22-20}$$


Although (22-20) is correct, its real justification can only be given by 
means of the complete formalism of quantum mechanics. 
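
Once h is fixed, the Sackur-Tetrode formula (22-16) gives a definite number for the entropy. The sketch below evaluates it for a helium-like monatomic gas; it is an illustration only. The constants are modern values quoted per mole rather than per kilomole, the temperature and pressure are assumed, and the optional `cell_h` argument is a hypothetical parameter of ours that shows how a different cell size would shift the entropy by an additive constant.

```python
import math

K_BOLTZMANN = 1.380649e-23    # J/K
PLANCK = 6.62607015e-34       # J s
AVOGADRO = 6.02214076e23      # per mole (the text works per kilomole)
ATOMIC_MASS = 1.66053907e-27  # kg

def sackur_tetrode(n_particles, volume, temperature, mass, cell_h=PLANCK):
    """Entropy from (22-16); cell_h**3 is the assumed cell size in mu-space."""
    arg = ((volume / n_particles)
           * (2.0 * math.pi * mass * K_BOLTZMANN * temperature) ** 1.5
           / cell_h ** 3)
    # ln(arg * e^(5/2)) = ln(arg) + 5/2
    return n_particles * K_BOLTZMANN * (math.log(arg) + 2.5)

m_he = 4.0026 * ATOMIC_MASS   # assumed helium-like atom
T = 298.15                    # K, assumed
p = 101325.0                  # Pa, assumed
V = AVOGADRO * K_BOLTZMANN * T / p   # volume of one mole from (22-15)

s_molar = sackur_tetrode(AVOGADRO, V, T, m_he)
print("molar entropy, J/(mol K):", s_molar)
# Doubling N and V together doubles S (extensive property).
print(sackur_tetrode(2 * AVOGADRO, 2 * V, T, m_he) / s_molar)
# Doubling the cell edge h lowers S by 3 N k ln 2.
print(sackur_tetrode(AVOGADRO, V, T, m_he, cell_h=2 * PLANCK) - s_molar)
```

The last line makes the point of this section explicit: without a definite h the entropy would be determined only up to an additive constant.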


22-4 Mixtures of gases 


Before we go on, we should make certain that our methods will describe 
equilibrium if there is more than one type of molecule, as would be the 
case for a mixture of gases. We shall discuss this problem only enough to 
see that we are getting reasonable results. 

For simplicity, we restrict ourselves to the case of only two gases 
occupying the same volume V and possessing a total energy U. The 
masses of the molecules are $m_1$ and $m_2$, and their numbers are $N_1$ and $N_2$. 
We define a μ-space for each gas and let the numbers of molecules per cell 
in each μ-space be denoted by $n_i^{(1)}$ and $n_j^{(2)}$. 

The probabilities for the two gases as obtained from (21-5) are 

$$W_1 = \frac{1}{n_1^{(1)}!\,n_2^{(1)}! \cdots}, \qquad W_2 = \frac{1}{n_1^{(2)}!\,n_2^{(2)}! \cdots} \tag{22-21}$$
and the probability for the mixture is $W = W_1 W_2$. Using (22-21) and 
(21-13), we obtain 

$$\ln W = N_1 + N_2 - \sum_i n_i^{(1)} \ln n_i^{(1)} - \sum_j n_j^{(2)} \ln n_j^{(2)} \tag{22-22}$$

We want to find the maximum of ln W subject to the conditions of constant 
number of particles and energy, namely, 

$$\sum_i n_i^{(1)} = N_1, \qquad \sum_j n_j^{(2)} = N_2, \qquad \sum_i n_i^{(1)}\varepsilon_i^{(1)} + \sum_j n_j^{(2)}\varepsilon_j^{(2)} = U$$

Proceeding as in Sec. 21-3, we introduce three Lagrange multipliers $\alpha_1$, $\alpha_2$, 
$\beta$ corresponding to these three constraints, and we find the condition for 
the maximum to be 

$$\sum_i \delta n_i^{(1)}\left[\ln n_i^{(1)} + \alpha_1 + \beta\varepsilon_i^{(1)}\right] + \sum_j \delta n_j^{(2)}\left[\ln n_j^{(2)} + \alpha_2 + \beta\varepsilon_j^{(2)}\right] = 0$$

which leads to 

$$n_i^{(1)} = e^{-\alpha_1 - \beta\varepsilon_i^{(1)}}, \qquad n_j^{(2)} = e^{-\alpha_2 - \beta\varepsilon_j^{(2)}} \tag{22-23}$$


If we had done these calculations before we had identified $\beta$ as 1/kT, the 
results (22-23) would have shown us that $\beta$ must somehow be connected 
with the temperature. If the two gases are in mutual equilibrium, they 
must have the same temperature and $\beta$ is the only quantity in (22-23) 
which is the same for the two distributions and hence could possibly 
describe what the gases have in common. 

If we calculate the partition functions as in (22-9), we find 

$$Z_0^{(1)} = \frac{V}{h^3}(2\pi m_1 kT)^{3/2}, \qquad Z_0^{(2)} = \frac{V}{h^3}(2\pi m_2 kT)^{3/2}$$
so that the Helmholtz functions as found from (21-62) are 
$$F_1 = -N_1 kT\ln V + F_{10}, \qquad F_2 = -N_2 kT\ln V + F_{20} \tag{22-24}$$

where $F_{10}$ and $F_{20}$ are independent of V. The total Helmholtz function of 
the system therefore is 

$$F = F_1 + F_2 = -(N_1 + N_2)kT\ln V + F_{10} + F_{20} \tag{22-25}$$
The pressure is found from (21-63) and (22-25) to be 

$$p = -\left(\frac{\partial F}{\partial V}\right)_T = \frac{N_1 kT}{V} + \frac{N_2 kT}{V} = p_1 + p_2 \tag{22-26}$$
We also see that $p_1 = -(\partial F_1/\partial V)_T$ and $p_2 = -(\partial F_2/\partial V)_T$, so that $p_1$ and 
$p_2$ are the partial pressures of the constituents of the mixture. Therefore 
we have derived Dalton’s law of partial pressures (8-22) by our statistical 
mechanical methods. 


Exercises 


22-1. Find expressions for the multiplicative constants of (22-4), (22-5), and 
(22-6). 

22-2. Find the density as a function of position for a gas comprised of N 
molecules contained in a cylinder of radius a and length l which is rotating 
about its axis with angular velocity ω. 

22-3. Suppose that the particles of an ideal gas have the relativistic dependence 
of energy on momentum 

$$\varepsilon = c\left(p_x^2 + p_y^2 + p_z^2 + m_0^2 c^2\right)^{1/2} \tag{22-27}$$

as found from (5-24). Find the energy, Helmholtz function, and equation of 
state for this ideal gas in the extreme relativistic range where $m_0 c \ll p$, and 
compare the results with (22-12) and (22-15). 


23 Equations of state of real gases 


There are forces between the molecules of real gases; hence these mole- 
cules cannot be regarded as completely independent particles and the 
methods we have developed are not directly applicable to them. If we 
restrict ourselves to situations where the gas does not deviate too much 
from ideal behavior, we can hope to develop approximation methods 
which will enable us to calculate the corrections to our results for ideal 
gases which arise from the intermolecular forces. In this chapter we shall 
show how the Boltzmann factor $e^{-\phi/kT}$ can be used to calculate the 
equation of state and to verify more accurately some of the remarks we 
made about the van der Waals gas in Chapter 12. 


23-1 Virial coefficients and the general equation of state 


If we write van der Waals’ equation of state (12-6) in terms of the 
number of molecules N by using N = νL, we obtain 

$$\left(p + \frac{a'N^2}{V^2}\right)(V - Nb') = NkT \tag{23-1}$$
where 

$$a' = \frac{a}{L^2}, \qquad b' = \frac{b}{L} \tag{23-2}$$

In order to illustrate the motivation for our later calculations let us 
expand (23-1) to first order in small quantities; if we multiply out the 
left side and neglect the second order term $-a'b'N^3/V^2$, we obtain 

$$pV - b'Np + \frac{a'N^2}{V} \approx NkT \tag{23-3}$$
so that, to first order, 
$$p \approx \frac{NkT}{V} \tag{23-4}$$

Since the correction terms to (23-4) are of first order, we can replace p in 
the second term of (23-3) by (23-4) and still get a result which differs from 
(23-3) only by second order terms which we are neglecting anyhow. 
When we do this, we obtain 

$$pV - b'N\left(\frac{NkT}{V}\right) + \frac{a'N^2}{V} \approx NkT$$

which becomes 

$$pV \approx NkT + \frac{N^2}{V}\left(b'kT - a'\right) \tag{23-5}$$
The terms we have neglected would appear on the right side of (23-5) as 
inversely proportional to $V^2$. 

It is clear from this example that we can expand the equation of state 
for any gas in the general form 


$$pV = NkT + \frac{B(T)}{V} + \frac{C(T)}{V^2} + \cdots \tag{23-6}$$
since this form approaches the ideal gas law for large V, as do those 
equations found empirically. The coefficients in this expansion are called 
virial coefficients; NkT is the first virial coefficient, B(T) is the second 
virial coefficient, etc. The more virial coefficients we can obtain, the more 
terms we shall have in the expansion; hence the equation of state (23-6) 
could be expected to be more accurate for small V. 

We see from (23-5) and (23-6) that the second virial coefficient of a van 
der Waals gas is 

$$B(T) = N^2\left(b'kT - a'\right) \tag{23-7}$$
and is a linear function of T as shown by the straight line of Fig. 23-1. 
Experimental values of B(T) for real gases yield a curve with a slight 
curvature which is like the dashed curve in the same figure. Thus the van 
der Waals equation predicts a second virial coefficient which is remarkably 
like those observed. 

[Fig. 23-1]

We shall see how we can calculate the second virial coefficient for a 
general real gas and thus find the first correction term to the ideal gas 
equation of state. In order to do this, we must first discuss a theorem of 
classical mechanics. 
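
The quality of the truncation leading to (23-5) can be judged by comparing the full van der Waals pressure from (23-1) with the expansion (23-6) cut off after the second virial coefficient (23-7). The sketch below does this in Python. The constants a and b are rough nitrogen-like values per mole, introduced only as illustrative assumptions, and the function names are ours.

```python
K_BOLTZMANN = 1.380649e-23  # J/K
AVOGADRO = 6.02214076e23    # per mole (the text works per kilomole)

def p_van_der_waals(n, volume, temperature, a1, b1):
    """Pressure from (23-1), solved for p."""
    return n * K_BOLTZMANN * temperature / (volume - n * b1) - a1 * n * n / volume ** 2

def second_virial(n, temperature, a1, b1):
    """B(T) of a van der Waals gas, equation (23-7)."""
    return n * n * (b1 * K_BOLTZMANN * temperature - a1)

def p_truncated(n, volume, temperature, a1, b1):
    """Pressure from (23-6), keeping only NkT and B(T)/V."""
    return (n * K_BOLTZMANN * temperature + second_virial(n, temperature, a1, b1) / volume) / volume

def relative_error(n, volume, temperature, a1, b1):
    exact = p_van_der_waals(n, volume, temperature, a1, b1)
    return abs(p_truncated(n, volume, temperature, a1, b1) - exact) / exact

# Assumed, roughly nitrogen-like constants per mole, converted per molecule as in (23-2).
a_molar, b_molar = 0.137, 3.87e-5          # Pa m^6 / mol^2, m^3 / mol
a1, b1 = a_molar / AVOGADRO ** 2, b_molar / AVOGADRO

for volume in (2.5e-2, 2.5e-3, 2.5e-4):    # one mole at 300 K
    print(volume, relative_error(AVOGADRO, volume, 300.0, a1, b1))
```

The discrepancy grows as the volume shrinks, in line with the remark that further virial coefficients are needed for small V.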


23-2 The virial theorem 


This theorem is quite different from most theorems in mechanics 
because it is statistical in nature; that is, it deals with time averages of 
certain mechanical quantities. 

Let us consider a general system composed of mass points whose 
position vectors are $\mathbf{r}_i$; the force on the ith particle is $\mathbf{F}_i$ and may include 
forces of constraint. The equations of motion of the system are 

$$\mathbf{F}_i = \frac{d\mathbf{p}_i}{dt} = m_i\frac{d^2\mathbf{r}_i}{dt^2} \tag{23-8}$$

We define a quantity 

$$G = \sum_i \mathbf{p}_i \cdot \mathbf{r}_i \tag{23-9}$$
If we calculate dG/dt from (23-9) and use (23-8), we find that 

$$\begin{aligned}
\frac{dG}{dt} &= \sum_i \mathbf{p}_i \cdot \frac{d\mathbf{r}_i}{dt} + \sum_i \frac{d\mathbf{p}_i}{dt} \cdot \mathbf{r}_i \\
&= 2E_k + \sum_i \mathbf{F}_i \cdot \mathbf{r}_i
\end{aligned} \tag{23-10}$$
where 
$$E_k = \sum_i \frac{1}{2}m_i\left(\frac{d\mathbf{r}_i}{dt}\right)^2$$

is the total kinetic energy of the system. 
If we calculate the time average of (23-10) over an interval $\tau$, we obtain 

$$\overline{\frac{dG}{dt}} = \frac{1}{\tau}\int_0^{\tau}\frac{dG}{dt}\,dt = 2\overline{E_k} + \overline{\sum_i \mathbf{F}_i \cdot \mathbf{r}_i} = \frac{G(\tau) - G(0)}{\tau} \tag{23-11}$$
If the motion of the system is periodic, we can choose $\tau$ to be the period 
and the right side of (23-11) will be zero. If the motion is not periodic, 
but is bounded so that the coordinates and momenta remain finite, then 
we see from (23-9) that G will never become greater than some finite 
upper bound; in this case, we can make the right side of (23-11) vanish by 
choosing $\tau$ sufficiently large. Thus we can conclude in general that 
$\overline{dG/dt} = 0$ and therefore 

$$\overline{E_k} = -\frac{1}{2}\,\overline{\sum_i \mathbf{F}_i \cdot \mathbf{r}_i} \tag{23-12}$$


This is the virial theorem; the right side is called the virial of Clausius. 

The virial theorem deals with time averages which are basically what we 
are looking for in statistical mechanics. However, we have decided that 
time averages and ensemble averages are equivalent for our purposes; 
thus we can also interpret (23-12) as expressing a relation between en- 
semble averages, or as a relation involving both time and ensemble 
averages. 
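
A simple periodic system gives a quick check of (23-12). The sketch below averages both sides over one period of a one-dimensional harmonic oscillator, using the known solution x = A cos ωt. The oscillator, its parameter values, and the function name are illustrative assumptions of ours; the text does not treat this case.

```python
import math

def virial_averages(mass=1.0, spring=4.0, amplitude=0.5, samples=1000):
    """Time averages over one period of x(t) = A cos(w t) with F = -k x.

    Returns (average kinetic energy, -1/2 * average of F*x); by (23-12)
    the two numbers should coincide.
    """
    omega = math.sqrt(spring / mass)
    period = 2.0 * math.pi / omega
    kinetic_sum = 0.0
    virial_sum = 0.0
    for j in range(samples):
        t = (j + 0.5) * period / samples       # midpoints of equal time steps
        x = amplitude * math.cos(omega * t)
        v = -amplitude * omega * math.sin(omega * t)
        force = -spring * x
        kinetic_sum += 0.5 * mass * v * v
        virial_sum += force * x
    return kinetic_sum / samples, -0.5 * virial_sum / samples

kinetic, virial = virial_averages()
print(kinetic, virial)
```

Because the motion is periodic and the average is taken over exactly one period, G(τ) = G(0) and the two averages agree, as the argument following (23-11) requires.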


23-3 Contribution of the pressure 
We have seen in (17-14) and (22-12) that $\overline{E_k} = \frac{3}{2}NkT$ for monatomic 
gases so that (23-12) becomes 
$$NkT = -\frac{1}{3}\,\overline{\sum_i \mathbf{F}_i \cdot \mathbf{r}_i} \tag{23-13}$$

We shall neglect all external body forces such as gravity; the only forces 
on a molecule which we need consider then are the force exerted by the 
wall of the container, $\mathbf{F}_{iw}$, and that due to the other molecules, $\mathbf{F}_i'$. Thus 
we can write $\mathbf{F}_i = \mathbf{F}_{iw} + \mathbf{F}_i'$ and (23-13) becomes 

$$NkT = -\frac{1}{3}\,\overline{\sum_i \mathbf{F}_{iw} \cdot \mathbf{r}_i} - \frac{1}{3}\,\overline{\sum_i \mathbf{F}_i' \cdot \mathbf{r}_i} \tag{23-14}$$

We consider first the contribution to NkT from the wall forces. Since 
the wall will act on a molecule only when it is very near the wall, it will be 
a good approximation to replace $\mathbf{r}_i$ by the position vector $\mathbf{r}$ of the element 
of area of the wall $d\mathbf{a}$ which the molecule is near. Since $\mathbf{r}$ is constant, the 
first sum in (23-14) will now involve only $\mathbf{F}_{iw}$; however, the average force 
exerted by the wall on the gas near the area $d\mathbf{a}$ is $-p\,d\mathbf{a}$ since the direction 
of $d\mathbf{a}$ is along the outer normal. In other words, we have 

$$\overline{\sum_{\text{molecules at } d\mathbf{a}} \mathbf{F}_{iw} \cdot \mathbf{r}_i} = -p\,d\mathbf{a} \cdot \mathbf{r}$$

so that the sum over all the molecules can be replaced by an integration 
over the total bounding surface S of the volume V. If we also use the 
divergence theorem (I: 1-33), the first sum in (23-14) can be evaluated as 
follows: 

$$\begin{aligned}
\overline{\sum_i \mathbf{F}_{iw} \cdot \mathbf{r}_i} &= \oint_S \left(-p\,\mathbf{r} \cdot d\mathbf{a}\right) = -p\oint_S \mathbf{r} \cdot d\mathbf{a} \\
&= -p\int_V \operatorname{div}\mathbf{r}\,dV = -3p\int_V dV = -3pV
\end{aligned} \tag{23-15}$$
If we insert (23-15) into (23-14), we finally obtain 

$$pV = NkT + \frac{1}{3}\,\overline{\sum_i \mathbf{F}_i' \cdot \mathbf{r}_i} \tag{23-16}$$

which includes all forces except those exerted by the wall. If $\mathbf{F}_i' = 0$, we 
find that pV = NkT, as we would expect. On comparing (23-16) and 
(23-6), we see that we should be able to calculate the virial coefficients 
from a knowledge of the intermolecular forces $\mathbf{F}_i'$. 


23-4 Contribution from intermolecular forces 
For simplicity, we shall consider only the case of central forces so that, 


if r is the separation of two molecules, the force between them is f(r). The 
force is repulsive if f is positive, and attractive if f is negative. 

Consider the pair consisting of the jth and kth molecules as shown in 
Fig. 23-2. Since $\mathbf{r}_{jk} = \mathbf{r}_j - \mathbf{r}_k$ and $\mathbf{F}_j' = -\mathbf{F}_k'$ by (I: 3-10), we obtain 

$$\sum_{\text{pair}} \mathbf{F}_i' \cdot \mathbf{r}_i = \mathbf{F}_j' \cdot \mathbf{r}_j + \mathbf{F}_k' \cdot \mathbf{r}_k = \mathbf{F}_j' \cdot \mathbf{r}_{jk} = r_{jk}\,f(r_{jk})$$

so that, for central forces, (23-16) becomes 

$$pV = NkT + \frac{1}{3}\,\overline{\sum_{\text{pairs}} r f(r)} \tag{23-17}$$
The number of pairs that can be obtained from N molecules is 
$\frac{1}{2}N(N - 1) \approx \frac{1}{2}N^2$; hence the sum in (23-17) can also be written in terms 
of the result for a typical pair as 

$$pV = NkT + \frac{1}{6}N^2\left[\overline{r f(r)}\right]_{\text{pair}} \tag{23-18}$$


If we let $\phi(r)$ be the potential energy for the forces, we have 

$$f(r) = -\frac{d\phi}{dr} \tag{23-19}$$
where the potential energy is chosen to be zero when the molecules are a 
great distance apart. The dependence of $\phi$ on r is generally assumed to be 
similar to that shown in Fig. 23-3. The steep rise in $\phi$ 
for small values of r describes a repulsive force which is effective when 
the molecules are close together; for greater separations the force is 
attractive and becomes very small at large distances. 

[Fig. 23-2]

We shall calculate only the second virial coefficient B(T); because of 
(23-6), therefore, we need only include terms to the order 1/V and can 
neglect terms proportional to $1/V^2$, $1/V^3$, etc. From (16-9), we see that in 
order to find $\overline{r f(r)} = -\overline{r\,(d\phi/dr)}$ for a given pair we need the probability 
that they will be a distance r apart. According to (22-7), we can take this 
probability to be proportional to the Boltzmann factor $e^{-\phi/kT}$. Therefore, 
if w(r) dr is the probability that the second molecule of the pair will be in 
the spherical shell of thickness dr at a distance r from the first so that it is 
in the volume $4\pi r^2\,dr$, we have 

$$w(r)\,dr = A\,4\pi r^2 e^{-\phi/kT}\,dr \tag{23-20}$$

where A is a normalization constant determined, according to (16-10), by 

$$\int w(r)\,dr = 1 = A\int 4\pi r^2 e^{-\phi/kT}\,dr \tag{23-21}$$

[Fig. 23-3]

If we plot $e^{-\phi/kT}$ as a function of r for a potential of the general form 
shown in Fig. 23-3, we get a curve like that of Fig. 23-4. Since the integral 
over r must cover the volume V, whose dimensions are extremely 
large compared to average molecular separations, the figure shows that 
$e^{-\phi/kT} \approx 1$ except for a very small portion of the range of integration of 
r and therefore we can approximate (23-21) by 
$$1 = A\int 4\pi r^2\,dr = AV \quad \text{and} \quad A \approx \frac{1}{V} \tag{23-22}$$


If we combine (16-9), (23-19), (23-20), and (23-22), we obtain 

$$\left[\overline{r f(r)}\right]_{\text{pair}} = \overline{\left(-r\frac{d\phi}{dr}\right)} = -\frac{4\pi}{V}\int_0^{\infty} r^3\frac{d\phi}{dr}\,e^{-\phi/kT}\,dr \tag{23-23}$$

[Fig. 23-4]

We have been able to extend the upper limit of integration out to infinity, 
for any error introduced in this way would be of higher order than the 
terms we are keeping since (23-23) is already proportional to 
1/V. It is convenient to integrate by parts in (23-23); taking account of 
the fact that our result must vanish in the absence of molecular forces 
($\phi = 0$), we obtain 

$$\begin{aligned}
\int_0^{\infty} r^3\frac{d\phi}{dr}\,e^{-\phi/kT}\,dr &= \left[kT\left(1 - e^{-\phi/kT}\right)r^3\right]_0^{\infty} - 3kT\int_0^{\infty} r^2\left(1 - e^{-\phi/kT}\right)dr \\
&= -3kT\int_0^{\infty} r^2\left(1 - e^{-\phi/kT}\right)dr
\end{aligned} \tag{23-24}$$

where we have used (23-19) and assumed that the potential falls off 
rapidly enough with r so that, as $r \to \infty$, 

$$kT\left(1 - e^{-\phi/kT}\right)r^3 \to \phi r^3 \to 0$$
If we now use (23-24) in (23-23) and compare the resultant form of 
(23-18) with (23-6), we see that the second virial coefficient is given by 

$$B(T) = 2\pi N^2 kT\int_0^{\infty} r^2\left(1 - e^{-\phi/kT}\right)dr \tag{23-25}$$

The result applies to a gas for which the central force potential vanishes 
sufficiently rapidly at large separation; as a check on (23-25), we see that 
B = 0 when $\phi = 0$, as it should. 


Example. van der Waals Gas. Let us assume that the molecules are 
impenetrable spheres of radius $\frac{1}{2}s$, and that they have a weak mutual 
attraction when separated by more than the collision diameter s. 
Therefore $\phi$ will have the general appearance shown in Fig. 23-5 which 
approximately corresponds to that envisioned for a van der Waals gas. 

[Fig. 23-5]

If we assume that $|\phi| \ll kT$ so that 

$$1 - e^{-\phi/kT} \approx \phi/kT \qquad (r > s)$$
we obtain 

$$\begin{aligned}
\int_0^{\infty} r^2\left(1 - e^{-\phi/kT}\right)dr &= \int_0^{s} + \int_s^{\infty} \\
&= \int_0^{s} r^2\,dr + \int_s^{\infty} r^2\left(1 - e^{-\phi/kT}\right)dr \approx \frac{s^3}{3} + \frac{1}{kT}\int_s^{\infty}\phi r^2\,dr
\end{aligned}$$

Therefore we find from (23-25) that 
$$B(T) = N^2\left(\frac{2}{3}\pi s^3 kT + 2\pi\int_s^{\infty}\phi r^2\,dr\right) \tag{23-26}$$

which is exactly of the form (23-7) obtained for a van der Waals gas 
where 

$$b' = \frac{2}{3}\pi s^3, \qquad a' = -2\pi\int_s^{\infty}\phi(r)\,r^2\,dr \tag{23-27}$$


We see that a’ will be positive for attractive forces, as we assumed in our 
original discussion in Chapter 12. 
Since the molecular radius is $\frac{1}{2}s$, we also see that 

$$b' = 4 \cdot \frac{4}{3}\pi\left(\frac{s}{2}\right)^3$$

and hence is four times the volume of a molecule. This is precisely the 
interpretation of b’ which we made in our initial discussion of van der 
Waals’ equation as given after (12-1). 
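
The accuracy of the approximation used in this example can be tested by evaluating the integral in (23-25) numerically. The sketch below assumes, purely for illustration, a hard core of diameter s with an attractive tail φ = −ε(s/r)^6 for r > s; this specific potential, the reduced units (s = 1, t = kT/ε), and the function names are assumptions of ours and not taken from the text. With this potential, (23-27) gives b' = (2/3)πs³ and a' = (2/3)πεs³, so the van der Waals estimate of the reduced integral is 1/3 − 1/(3t).

```python
import math

def reduced_virial_integral(t, steps=4000):
    """Integral of r^2 (1 - exp(-phi/kT)) dr from 0 to infinity, in units s = 1.

    Assumed potential: hard core for r < 1, phi = -eps / r**6 for r > 1,
    with t = kT/eps. Then B(T) = 2 pi N^2 kT s^3 times the returned value.
    """
    core = 1.0 / 3.0                 # 1 - exp(-phi/kT) = 1 inside the core
    du = 1.0 / steps
    tail = 0.0
    for j in range(steps):
        u = (j + 0.5) * du           # substitution u = 1/r, midpoint rule on (0, 1)
        # r^2 dr -> du / u^4 and 1 - exp(u^6 / t) = -expm1(u^6 / t)
        tail += -math.expm1(u ** 6 / t) / u ** 4 * du
    return core + tail

def vdw_estimate(t):
    """Same quantity with 1 - exp(-phi/kT) replaced by phi/kT for r > s."""
    return 1.0 / 3.0 - 1.0 / (3.0 * t)

for t in (1.0, 2.0, 5.0, 10.0, 100.0):
    print(t, reduced_virial_integral(t), vdw_estimate(t))
```

The two columns approach each other as t grows, which is the regime |φ| ≪ kT assumed in the example; at low temperature the linear form (23-7) overestimates B(T), consistent with the curvature of the experimental curve mentioned in connection with Fig. 23-1.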


Exercises 


23-1. A system of mass points moves in three dimensions under the influence 
of an external inverse square central force. Show that the average value of the 
potential energy per molecule is $-3kT$. 

23-2. Find the second virial coefficient for a gas for which the mutual repulsion 
between pairs of molecules is described by the potential energy $\phi = \text{const.}/r^n$ 
where n > 3. 

23-3. Calculate the Joule-Thomson coefficient « for the equation of state 
(23-6), and thereby show that the inversion temperature is determined approximately
by the condition $dB/dT = B/T$. How can one use this condition to
determine $T_i$ graphically from an actual curve of $B$ vs. $T$ such as the dashed
one of Fig. 23-1?

24 Paramagnetism and ferromagnetism


The additional variables which are appropriate for the thermodynamic
description of a magnetic system have been discussed in Chapter 14. In
order to calculate them, we now assume that each subsystem possesses
enough internal structure that it has a permanent magnetic dipole moment
$\boldsymbol{\mu}$. As has been shown in (I: 38-19 and 23-15), the energy of a permanent
dipole in an external magnetic field is

$$\epsilon_m = -\boldsymbol{\mu}\cdot\mathbf{B} = -\mu B\cos\theta \qquad (24\text{-}1)$$

where $\theta$ is the angle between the directions of $\boldsymbol{\mu}$ and $\mathbf{B}$. If $\mathcal{M}$ is the total
component of the magnetic dipole moment of the system in the direction
of the applied field,

$$\mathcal{M} = \sum_k (\mu\cos\theta)_k = N\,\overline{\mu\cos\theta} \qquad (24\text{-}2)$$

since $(\mu\cos\theta)_k$ is the component of the $k$th subsystem.

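If the energy corresponding to the $i$th cell is written $\epsilon_i = \epsilon_0 + \epsilon_m$,
where $\epsilon_0$ is that part of the energy which is independent of the magnetic
field, we have

$$Z_0 = \sum_i e^{-\beta\epsilon_i} = \sum_i e^{-\beta\epsilon_0 - \beta\epsilon_m} = \sum_i e^{-\beta\epsilon_0 + \beta\mu B\cos\theta} \qquad (24\text{-}3)$$

If we calculate $\partial \ln Z_0/\partial B$ and use (21-33) and (24-2), we obtain

$$\mathcal{M} = \frac{N}{\beta}\,\frac{\partial \ln Z_0}{\partial B} \qquad (24\text{-}4)$$

as our basic result.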


24-1 Langevin paramagnetism 


The first calculation of the magnetization of a paramagnetic system, 
that is, one comprised of independent subsystems, was made by Langevin. 
He used the Boltzmann factor to evaluate the probability of $\mu\cos\theta$ in a
way which is intuitively very appealing and leads to the correct result.
However, Langevin's basic assumption is not an immediate consequence
of our result (21-33) for the probability of occupation of a cell in $\mu$-space,
nor does it follow directly from the Boltzmann distribution (22-7) for the 
center of mass of the molecules. Consequently, we shall derive Langevin’s 
result in a different and more accurate manner while leaving it as an 
exercise to apply his method to this problem. 

Since our molecule possesses a dipole moment $\boldsymbol{\mu}$, it cannot have spherical
symmetry but must have an axis of symmetry as one of its properties.
Therefore any model we adopt must also be of this type and, for example,
cannot be a mass point. It will be sufficient for our purposes to use the
simplest model which has this property. This is the rigid rotator or
dumbbell diatomic molecule, and it consists of two particles at the constant
distance $a$ apart; the energy associated with the motion with respect to
the center of mass is given in terms of the reduced mass $m'$ by

$$\frac{1}{2m'a^2}\left(p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}\right) \qquad (24\text{-}5)$$

as can be seen from (I: Exercise 10-1). If the $z$ axis is chosen to be along
the external field, then $\theta$ in (24-5) is the angle between $\mathbf{B}$ and the line
connecting the two particles; since this line is the only axis of symmetry,
it necessarily coincides with the direction of $\boldsymbol{\mu}$, and hence $\theta$ is also the same
angle used in (24-1).

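The total energy $\epsilon_i$ of a particle in a cell is then found, from (22-1),
(24-1), and (24-5), to have the form

$$\epsilon_i = \epsilon'' + \frac{p_\phi^2}{2m'a^2\sin^2\theta} - \mu B\cos\theta \qquad (24\text{-}6)$$

where $\epsilon''$ is independent of $\theta$ and $B$. We can multiply (24-3) by $d\Omega_\mu/h^5$
and write it as an integral where

$$d\Omega_\mu = dx\,dy\,dz\,d\theta\,d\phi\,dp_x\,dp_y\,dp_z\,dp_\theta\,dp_\phi$$

and also use (24-6) for $\epsilon_i$; the result is that $Z_0$ has the form

$$Z_0 = Z_0' \int_0^\pi e^{\beta\mu B\cos\theta}\,d\theta \int_{-\infty}^{\infty} e^{-\beta p_\phi^2/2m'a^2\sin^2\theta}\,dp_\phi \qquad (24\text{-}7)$$

where $Z_0'$ is independent of $B$. If we integrate over $p_\phi$ with the aid of
(18-9a), we find that (24-7) becomes

$$Z_0 = Z_0'' \int_0^\pi e^{\beta\mu B\cos\theta}\sin\theta\,d\theta = Z_0''\,Z_B \qquad (24\text{-}8)$$
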
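where $Z_0''$ is independent of $B$. We can evaluate $Z_B$ by setting $x = \cos\theta$
so that

$$Z_B = \int_{-1}^{1} e^{\beta\mu B x}\,dx = \frac{2\sinh(\beta\mu B)}{\beta\mu B} \qquad (24\text{-}9)$$

If we substitute (24-8) into (24-4), note that $\ln Z_0 = \ln Z_0'' + \ln Z_B$,
use (24-9) and (21-60), we obtain

$$\mathcal{M} = \frac{N}{\beta}\,\frac{\partial \ln Z_B}{\partial B} = N\mu\left[\operatorname{ctnh}\left(\frac{\mu B}{kT}\right) - \frac{kT}{\mu B}\right] = \mathcal{M}_0\,\mathcal{L}(\alpha) \qquad (24\text{-}10)$$

where

$$\alpha = \frac{\mu B}{kT} \qquad (24\text{-}11)$$

and the function

$$\mathcal{L}(\alpha) = \operatorname{ctnh}\alpha - \frac{1}{\alpha} \qquad (24\text{-}12)$$

is called the Langevin function; also,

$$\mathcal{M}_0 = N\mu \qquad (24\text{-}13)$$

is the value of $\mathcal{M}$ when all the individual dipoles are parallel to the field;
$\mathcal{M}_0$ therefore is the maximum possible value of the total moment and is
called the saturation moment.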
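The general dependence of $\mathcal{L}(\alpha)$ on $\alpha$ is shown in Fig. 24-1. As $\alpha \to \infty$,
corresponding to large fields and low temperatures,

$$\operatorname{ctnh}\alpha = \frac{e^{\alpha} + e^{-\alpha}}{e^{\alpha} - e^{-\alpha}} \to 1$$

so that we can approximate $\mathcal{L}$ by

$$\mathcal{L}(\alpha) \approx 1 - \frac{1}{\alpha} \qquad (\alpha \gg 1) \qquad (24\text{-}14)$$

showing that $\mathcal{L}(\alpha) \to 1$ as $\alpha \to \infty$ as indicated in the figure.

Fig. 24-1

At the other extreme, generally corresponding to low fields and high temperatures,
$\alpha \ll 1$, and if we use the series expansion for $\operatorname{ctnh}\alpha$ we find that

$$\mathcal{L}(\alpha) = \frac{\alpha}{3} - \frac{\alpha^3}{45} + \cdots \qquad (\alpha \ll 1) \qquad (24\text{-}15)$$

showing that the initial slope of the curve in Fig. 24-1 is

$$\mathcal{L}'(0) = \left(\frac{d\mathcal{L}}{d\alpha}\right)_{\alpha=0} = \frac{1}{3} \qquad (24\text{-}16)$$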


If $\alpha$ is extremely small, we need use only the first term of (24-15), and
(24-10) becomes

$$\mathcal{M} = \frac{\mathcal{M}_0\,\alpha}{3} = \frac{N\mu^2 B}{3kT} = \left(\frac{\mu_0 N\mu^2}{3k}\right)\frac{H}{T} = \frac{\mathcal{C}H}{T} \qquad (24\text{-}17)$$

with the use of the good approximation, $B \approx \mu_0 H$. Equation (24-17) is
the Curie law (14-14) which we discussed earlier. Thus we have obtained
a theoretical expression for the Curie constant

$$\mathcal{C} = \frac{\mu_0 N\mu^2}{3k} \qquad (24\text{-}18)$$

It enables us to determine the magnitudes of the molecular dipole moments
from measurements of the magnetic susceptibility of various materials.
We also see that the Curie law is a special case of the Langevin magnetic
equation of state (24-10) which itself is seen to be a function of the ratio
$B/T = \mu_0 H/T$ and thus describes an ideal magnetic material according to
(14-13).
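
The two limits of the Langevin function can be checked numerically. The Python sketch below is only an illustration: the function names, the switch-over point for the small-$\alpha$ series, and the use of SI units with the Boltzmann constant as a default argument are choices made for this example.

```python
import math

def langevin(alpha):
    """Langevin function ctnh(alpha) - 1/alpha, as in (24-12)."""
    if abs(alpha) < 1e-3:
        # The series (24-15) avoids the cancellation between ctnh and 1/alpha.
        return alpha / 3.0 - alpha**3 / 45.0
    return 1.0 / math.tanh(alpha) - 1.0 / alpha

def moment_ratio(mu, B, T, k=1.380649e-23):
    """M/M0 from (24-10) for a dipole moment mu (J/T) in a field B (T) at T (K)."""
    return langevin(mu * B / (k * T))

# Small alpha follows the Curie-law slope 1/3; large alpha approaches saturation.
for alpha in (0.01, 0.1, 1.0, 10.0, 100.0):
    print(alpha, langevin(alpha), alpha / 3.0, 1.0 - 1.0 / alpha)
```

For a moment of the order of one Bohr magneton in a field of one tesla at room temperature, $\alpha$ is of order $10^{-3}$, so such a system lies well inside the Curie-law regime.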


24-2 Weiss theory of ferromagnetism 


A notable characteristic of ferromagnetic materials, such as the metals 
iron, nickel, and cobalt, is that they can have a net magnetization in the 
absence of a field. Below a temperature characteristic of the material, the 
magnetization is very large, is virtually independent of the external field, 
and is almost independent of the temperature over a large temperature 
range. 

According to current ideas about ferromagnetism, the elementary 
magnetic moments interact very strongly with each other. Strictly 
speaking, therefore, they are not independent; hence the classical 
Boltzmann statistical mechanics we have been using should not be appli- 
cable. Weiss, however, made use of a simple artifice in order to apply 
the independent particle results even for ferromagnetic materials; in this 
manner, he obtained a surprisingly satisfactory theory. 
Since the magnetization is relatively independent of the external field,
we can conclude that the actual field acting on an individual magnetic
moment is much larger than the external field. Thus we try to describe
the over-all effect of the interactions within the system by an internal field
$H_i$ which we assume to be proportional to the total moment:

$$H_i = \lambda\mathcal{M} \qquad (24\text{-}19)$$

In (24-19), $\lambda$ is an empirical constant which is very large and characteristic
of the material. Thus the effective field $H_e$ acting on the magnetic moment
is to be taken as

$$H_e = H + H_i = H + \lambda\mathcal{M} \qquad (24\text{-}20)$$

We now use $H_e$ in the preceding formulas:

$$\alpha = \frac{\mu\mu_0 H_e}{kT} = \frac{\mu_0\mathcal{M}_0}{NkT}\,(H + \lambda\mathcal{M}) \qquad (24\text{-}21)$$

$$\mathcal{M} = \mathcal{M}_0\,\mathcal{L}(\alpha) = \mathcal{M}_0\,\mathcal{L}\left[\frac{\mu_0\mathcal{M}_0}{NkT}\,(H + \lambda\mathcal{M})\right] \qquad (24\text{-}22)$$

The last equation is the Weiss equation of state.
If $\alpha \ll 1$, we can take $\mathcal{L}(\alpha) \approx \tfrac{1}{3}\alpha$ by (24-15), so that (24-22) becomes

$$\mathcal{M} = \frac{\mu_0\mathcal{M}_0^2}{3NkT}\,(H + \lambda\mathcal{M})$$

which can be solved for $\mathcal{M}$; the result is

$$\mathcal{M} = \frac{\mathcal{C}H}{T - \theta_c} \qquad (24\text{-}23)$$

where $\mathcal{C}$ is given by (24-18) and

$$\theta_c = \lambda\mathcal{C} \qquad (24\text{-}24)$$

is called the Curie temperature. Equation (24-23) is known as the Curie-Weiss
law; P. Curie had found that (24-23) describes quite accurately the
experimental results for the "paramagnetic" moment of ferromagnetic
materials, that is, the field proportional moment above the characteristic
temperature of the system. The value of $\theta_c$ can be determined most
easily from a plot of the reciprocal of the ratio of total moment to field
as a function of $T$ as shown in Fig. 24-2. We also see from (24-23) that
the slope of this curve is $1/\mathcal{C}$. Once $\theta_c$ and $\mathcal{C}$ have been determined, the
value of $\lambda$ can be found from (24-24). In this way it has been shown that
the observed values of $\lambda$ are about a thousand times larger than can be
accounted for by any classical theory; these large internal fields are now
generally regarded as basically a phenomenon of quantum theory.
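
The graphical determination of $\theta_c$ and $\mathcal{C}$ just described amounts to fitting a straight line to $H/\mathcal{M}$ as a function of $T$. A minimal Python sketch of that procedure follows; the function name and the data are hypothetical, and the numbers are synthetic values generated from (24-23) itself rather than measurements.

```python
def fit_curie_weiss(temps, h_over_m):
    """Least-squares line through (T, H/M); returns (curie_constant, theta_c)."""
    n = len(temps)
    t_mean = sum(temps) / n
    y_mean = sum(h_over_m) / n
    sxx = sum((t - t_mean) ** 2 for t in temps)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(temps, h_over_m))
    slope = sxy / sxx                       # equals 1/C by (24-23)
    intercept = y_mean - slope * t_mean
    # theta_c is the temperature at which the line crosses H/M = 0.
    return 1.0 / slope, -intercept / slope

# Synthetic, noise-free illustration (made-up values, arbitrary units).
temps = [700.0, 800.0, 900.0, 1000.0]
data = [(t - 630.0) / 0.5 for t in temps]
curie_constant, theta_c = fit_curie_weiss(temps, data)
print(curie_constant, theta_c)
```

With real data only points well above the transition should be used, since (24-23) was derived for $\alpha \ll 1$.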

Fig. 24-2

When $\alpha$ is not small compared to unity, we can proceed differently. We
obtain the following two equations from (24-21) and (24-22) with the use
of (24-18) and (24-24):

$$\frac{\mathcal{M}}{\mathcal{M}_0} = \mathcal{L}(\alpha) \qquad (24\text{-}25a)$$

$$\frac{\mathcal{M}}{\mathcal{M}_0} = \frac{T}{3\theta_c}\,\alpha - \frac{H}{\lambda\mathcal{M}_0} \qquad (24\text{-}25b)$$

These are simultaneous equations which must be satisfied by $\mathcal{M}/\mathcal{M}_0$; in
other words, if both of these equations are plotted as a function of $\alpha$, the
value of $\mathcal{M}/\mathcal{M}_0$ which simultaneously satisfies the equations (24-25) is
that given by the intersection of the two curves. This graphical calculation
is illustrated in Fig. 24-3.

Fig. 24-3

The case $H = 0$ is of particular interest, and, as shown in Fig. 24-4, it
is still possible for the curves to intersect at a value of $\mathcal{M} \neq 0$. In other
words, the Weiss theory predicts that it is possible for the system to be
magnetized in the absence of an external field as is observed for ferromagnetic
materials; this net magnetic moment $\mathcal{M}_s$ is called the spontaneous
moment. As $T$ increases, the slope of the straight line (24-25b)
increases so that the point of intersection moves down toward the origin
as is also illustrated in Fig. 24-4. Thus the spontaneous moment decreases
as the temperature increases. We also see from the same figure that the
spontaneous moment will be zero when the slope of the straight line
(24-25b) is greater than or equal to the initial slope of the Langevin
function. Thus, from (24-25b) and (24-16), we find that $\mathcal{M}_s = 0$ when
$T/3\theta_c \geq \tfrac{1}{3}$, or

$$\mathcal{M}_s = 0 \quad \text{when} \quad T \geq \theta_c \qquad (24\text{-}26)$$

Fig. 24-4


Thus the Curie temperature $\theta_c$ is not only the temperature at which the
Curie-Weiss law (24-23) becomes meaningless, but it is also the highest
temperature for which a spontaneous moment is possible.

It will be left as an exercise to calculate the numerical values of $\mathcal{M}_s/\mathcal{M}_0$
as a function of $T/\theta_c$ by the graphical method of Fig. 24-4. The general
appearance of the resultant curve is shown in Fig. 24-5, and it agrees with
the over-all trend observed for most ferromagnetic systems.
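
The intersection of the two curves can also be located numerically instead of graphically. The Python sketch below is one possible approach, not the method of the text: it solves $\mathcal{L}(3m/t) = m$ with $m = \mathcal{M}_s/\mathcal{M}_0$ and $t = T/\theta_c$ (which is (24-25a) combined with (24-25b) for $H = 0$) by bisection. The function names, the bracketing interval, and the tolerance are assumptions of this example.

```python
import math

def langevin(alpha):
    """Langevin function (24-12); the series (24-15) is used for small alpha."""
    if abs(alpha) < 1e-3:
        return alpha / 3.0 - alpha**3 / 45.0
    return 1.0 / math.tanh(alpha) - 1.0 / alpha

def spontaneous_moment(t, tol=1e-12):
    """Return M_s/M_0 at reduced temperature t = T/theta_c (t > 0) for H = 0."""
    if t >= 1.0:
        return 0.0                      # (24-26): no spontaneous moment
    # Nonzero root of L(3m/t) - m = 0, from (24-25a) and (24-25b) with H = 0.
    def g(m):
        return langevin(3.0 * m / t) - m
    lo, hi = 1e-6, 1.0                  # g(lo) > 0 and g(hi) < 0 when t < 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for t in (0.2, 0.5, 0.8, 0.95, 1.0):
    print(f"T/theta_c = {t:4.2f}   M_s/M_0 = {spontaneous_moment(t):.4f}")
```

Because the Langevin function is concave for positive argument, there is at most one nonzero intersection, so the bisection bracket above contains exactly one root whenever $t < 1$.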


Fig. 24-5


24-3 Heat capacities 


Let us investigate the properties of the heat capacities of a system
described by the Weiss theory. We previously found the difference in the
heat capacities at constant field and constant magnetic moment to be
given by (14-21) as

$$C_H - C_{\mathcal{M}} = -\mu_0 T\left(\frac{\partial H}{\partial T}\right)_{\mathcal{M}}\left(\frac{\partial\mathcal{M}}{\partial T}\right)_H \qquad (24\text{-}27)$$

We shall have to evaluate the required derivatives indirectly since we have
not obtained an explicit expression for $\mathcal{M}(H, T)$.

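If we differentiate (24-22) with respect to $T$ while keeping $\mathcal{M}$ constant,
we obtain

$$0 = \mathcal{L}'(\alpha)\left(\frac{\partial\alpha}{\partial T}\right)_{\mathcal{M}} \qquad (24\text{-}28)$$

where

$$\mathcal{L}'(\alpha) = \frac{d\mathcal{L}}{d\alpha} \qquad (24\text{-}29)$$

Therefore, from (24-21) and (24-28), we obtain

$$\left(\frac{\partial\alpha}{\partial T}\right)_{\mathcal{M}} = -\frac{\mu_0\mathcal{M}_0}{NkT^2}\,(H + \lambda\mathcal{M}) + \frac{\mu_0\mathcal{M}_0}{NkT}\left(\frac{\partial H}{\partial T}\right)_{\mathcal{M}} = -\frac{\alpha}{T} + \frac{\mu_0\mathcal{M}_0}{NkT}\left(\frac{\partial H}{\partial T}\right)_{\mathcal{M}} = 0$$

so that

$$\left(\frac{\partial H}{\partial T}\right)_{\mathcal{M}} = \frac{Nk\alpha}{\mu_0\mathcal{M}_0} \qquad (24\text{-}30)$$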


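Similarly, we differentiate (24-22) with respect to $T$ at constant $H$ to obtain

$$\left(\frac{\partial\mathcal{M}}{\partial T}\right)_H = \mathcal{M}_0\,\mathcal{L}'(\alpha)\left(\frac{\partial\alpha}{\partial T}\right)_H = \mathcal{M}_0\,\mathcal{L}'(\alpha)\left[-\frac{\alpha}{T} + \frac{\mu_0\mathcal{M}_0\lambda}{NkT}\left(\frac{\partial\mathcal{M}}{\partial T}\right)_H\right]$$

The result of solving this equation for $(\partial\mathcal{M}/\partial T)_H$ and using (24-18) and
(24-24) is

$$\left(\frac{\partial\mathcal{M}}{\partial T}\right)_H = -\frac{\mathcal{M}_0\,\alpha\,\mathcal{L}'(\alpha)}{T - 3\theta_c\,\mathcal{L}'(\alpha)} \qquad (24\text{-}31)$$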
If we substitute (24-30) and (24-31) into (24-27), we obtain

$$C_H - C_{\mathcal{M}} = \frac{Nk\alpha^2\,\mathcal{L}'(\alpha)}{1 - 3(\theta_c/T)\,\mathcal{L}'(\alpha)} \qquad (24\text{-}32)$$

We shall consider only the case $H = 0$, $\mathcal{M} = \mathcal{M}_s$, and therefore

$$\alpha = \frac{\mu_0\mathcal{M}_0\lambda\mathcal{M}_s}{NkT} \qquad (24\text{-}33)$$

by (24-21).

If $T > \theta_c$, $\mathcal{M}_s = 0$ by (24-26) and therefore $\alpha = 0$; we then find from
(24-32) that $C_H = C_{\mathcal{M}}$. Thus, above the Curie temperature, the heat
capacities are equal in the absence of an external field, as we would expect
for this unmagnetized state.

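We now want to evaluate (24-32) for temperatures less than the Curie
temperature yet very close to $\theta_c$, that is, $T < \theta_c$ and $T \approx \theta_c$. We see from
Figs. 24-4 and 24-5 that $\mathcal{M}_s \approx 0$, so that $\alpha \ll 1$. We can accordingly use
the following approximations obtained from (24-22), (24-15), (24-24),
(24-18), and (24-33):

$$\mathcal{L}'(\alpha) \approx \frac{1}{3} - \frac{\alpha^2}{15}, \qquad \frac{T}{\theta_c} = \frac{3\mathcal{L}(\alpha)}{\alpha} \approx 1 - \frac{\alpha^2}{15}$$

When these expressions are substituted into (24-32), the numerator becomes
approximately $\tfrac{1}{3}Nk\alpha^2$ and the denominator approximately $\tfrac{2}{15}\alpha^2$, and we obtain

$$C_H - C_{\mathcal{M}} \to \tfrac{5}{2}Nk \quad \text{as} \quad T \to \theta_c \qquad (24\text{-}34)$$

Therefore the heat capacities differ by $\tfrac{5}{2}Nk$ immediately below the
Curie temperature and are equal right above this temperature. Similar
results can be obtained for the more general situation in which $H \neq 0$. In
other words, there is a discontinuity in the heat capacities at the Curie
temperature. Experimentally, jumps in the heat capacity of this order of
magnitude are indeed observed for ferromagnetic materials.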
The Weiss theory thus provides us with an example of a second order
phase transition since we previously saw after (15-30) that such a transition
is characterized by a heat capacity discontinuity. This type of transition
is also called an order-disorder transition because the system changes from
the highly ordered ferromagnetic state in which the elementary dipoles are
essentially parallel to the very disordered "paramagnetic" state in which
the dipoles are randomly oriented. Since the disordered state corresponds
to many more possible microstates, it also has a greater entropy so that
the heat capacity must rise suddenly as the transition point is approached
in order to produce the large entropy change which is required.
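
The size of the discontinuity can be checked by evaluating (24-32) numerically at $H = 0$. In the Python sketch below, the spontaneous value of $\alpha$ is found by bisection from $\mathcal{L}(\alpha) = (T/3\theta_c)\alpha$; the function names, the series cut-offs, and the large-argument guard are assumptions made for this illustration.

```python
import math

def langevin(a):
    """Langevin function (24-12), with the series (24-15) near zero."""
    if abs(a) < 1e-3:
        return a / 3.0 - a**3 / 45.0
    return 1.0 / math.tanh(a) - 1.0 / a

def langevin_prime(a):
    """dL/d(alpha) = 1/alpha**2 - 1/sinh(alpha)**2."""
    if abs(a) < 1e-2:
        return 1.0 / 3.0 - a * a / 15.0     # derivative of the series (24-15)
    if abs(a) > 50.0:
        return 1.0 / (a * a)                # the sinh term is negligible here
    return 1.0 / (a * a) - 1.0 / math.sinh(a) ** 2

def alpha_spontaneous(t):
    """Nonzero root of L(alpha) = t*alpha/3 for 0 < t = T/theta_c < 1."""
    lo, hi = 1e-6, 3.0 / t                  # L < 1 puts the root below 3/t
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if langevin(mid) > t * mid / 3.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def heat_capacity_gap(t):
    """(C_H - C_M)/(N k) from (24-32) at H = 0."""
    if t >= 1.0:
        return 0.0
    a = alpha_spontaneous(t)
    lp = langevin_prime(a)
    return a * a * lp / (1.0 - 3.0 * lp / t)

for t in (0.5, 0.9, 0.99, 0.999, 1.01):
    print(t, heat_capacity_gap(t))
```

The printed values should rise toward $\tfrac{5}{2}$ as $t \to 1$ from below and drop to zero above the Curie temperature.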


Exercises 


24-1. In a simple derivation of Langevin paramagnetism, one assumes that
the probability $dW$ that the direction of the dipole $\boldsymbol{\mu}$ is in the solid angle
$d\omega = \sin\theta\,d\theta\,d\phi$ and hence that $\boldsymbol{\mu}$ has the $z$ component $\mu\cos\theta$ is given by
$dW = \text{const.}\ e^{-\beta\epsilon_m}\,d\omega$. Show that this assumption also leads to (24-10). Why does
this method give the correct answer?

24-2. Calculate the numerical values of $\mathcal{M}_s/\mathcal{M}_0$ as a function of $T/\theta_c$ and thus
construct an accurate version of Fig. 24-5.

24-3. Show that the Weiss theory of ferromagnetism violates the third law
by showing that $(\partial\mathcal{M}_s/\partial T)_{T=0} \neq 0$.


25 The equipartition theorem and applications


The question of how the total energy of a system in equilibrium can be 
associated with average contributions from various aspects of its motion is 
partially answered by the famous general result of classical statistical
mechanics called the equipartition theorem. After we have obtained this 
result, we shall apply it to a few examples whose diversity will illustrate 
anew the over-all power and applicability of statistical mechanical 
methods. 


25-1 Separate contributions to the heat capacity 


If we let $\bar\epsilon$ be the average energy of one of the independent subsystems,
we see from (21-36) that

$$\bar\epsilon = -\frac{\partial\ln Z_0}{\partial\beta} \qquad (25\text{-}1)$$

We can also define a subsystem heat capacity $\bar c_v$ by

$$\bar c_v = \left(\frac{\partial\bar\epsilon}{\partial T}\right)_V \qquad (25\text{-}2)$$

so that the heat capacity of the whole system is given by

$$C_v = N\bar c_v \qquad (25\text{-}3)$$


Frequently the energy $\epsilon$ can be written as a sum of terms each of which
involves variables which do not appear in any other term; that is, $\epsilon$ has
the form

$$\epsilon = \epsilon_a(p_a, q_a) + \epsilon_b(p_b, q_b) + \cdots \qquad (25\text{-}4)$$

where $(p_a, q_a)$ stands for the whole set of variables in $\epsilon_a$, $(p_b, q_b)$ is the set
of different variables in $\epsilon_b$, etc.

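Writing the partition function as an integral over $\mu$-space in the usual
way by means of (22-20), we see that it can be written as a product:

$$Z_0 = \frac{1}{h^l}\int e^{-\beta(\epsilon_a + \epsilon_b + \cdots)}\,dp_1\cdots dq_l = \frac{1}{h^l}\,Z_a Z_b\cdots = \frac{1}{h^l}\prod_j Z_j \qquad (25\text{-}5)$$

where

$$Z_j = \int e^{-\beta\epsilon_j}\,(dq_j\,dp_j) \qquad (25\text{-}6)$$

and involves an integration over all the variables involved in $\epsilon_j$ as symbolized
by $(dq_j\,dp_j)$. Substituting (25-5) into (25-1), we see that $\bar\epsilon$ becomes

$$\bar\epsilon = -\sum_j\frac{\partial\ln Z_j}{\partial\beta} = \bar\epsilon_a + \bar\epsilon_b + \cdots = \sum_j\bar\epsilon_j \qquad (25\text{-}7)$$

and is in the form of a sum of contributions from the individual terms in
(25-4). Similarly, we find from (25-2) and (25-7) that the heat capacity
can also be written as a sum of separate contributions:

$$\bar c_v = \left(\frac{\partial\bar\epsilon}{\partial T}\right)_V = \sum_j\left(\frac{\partial\bar\epsilon_j}{\partial T}\right)_V = \sum_j\bar c_j \qquad (25\text{-}8)$$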


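Example. Linear Harmonic Oscillator. We see from (20-3) that $\epsilon$ can
be written in the form (25-4) as the sum of a kinetic energy term and a
potential energy term. From (25-6) and (18-9) it is seen that

$$Z_{kin} = \int_{-\infty}^{\infty} e^{-\beta p^2/2m}\,dp = \left(\frac{2\pi m}{\beta}\right)^{1/2}$$

$$Z_{pot} = \int_{-\infty}^{\infty} e^{-\beta m\omega^2 q^2/2}\,dq = \left(\frac{2\pi}{\beta m\omega^2}\right)^{1/2}$$

and, therefore,

$$\ln Z_{kin} = -\tfrac{1}{2}\ln\beta + \tfrac{1}{2}\ln(2\pi m)$$

so that

$$\bar\epsilon_{kin} = -\frac{\partial\ln Z_{kin}}{\partial\beta} = \frac{1}{2\beta} = \tfrac{1}{2}kT, \qquad \bar c_{kin} = \tfrac{1}{2}k \qquad (25\text{-}9)$$

Similarly,

$$\bar\epsilon_{pot} = \bar\epsilon_{kin} = \tfrac{1}{2}kT, \qquad \bar c_{pot} = \bar c_{kin} = \tfrac{1}{2}k \qquad (25\text{-}10)$$

and, because of (25-7) and (25-8),

$$\bar\epsilon = kT, \qquad \bar c_v = k \qquad (25\text{-}11)$$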


This example shows that the kinetic and potential energies of the 
oscillator contribute equally to its average energy; it is also a known 
result of mechanics that the time averages of the kinetic and potential 
energies of an oscillator are equal. 
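
The oscillator result can be confirmed by differentiating $\ln(Z_{kin}Z_{pot})$ numerically, which is (25-7) applied directly. The Python sketch below uses arbitrary illustrative values for the mass and frequency and a central difference for the derivative; these choices and the function names are assumptions of the example, and the answer should not depend on the mass or frequency.

```python
import math

def ln_z(beta, m=1.0, omega=2.0):
    """ln(Z_kin * Z_pot) for the oscillator, from the two Gaussian integrals."""
    z_kin = math.sqrt(2.0 * math.pi * m / beta)
    z_pot = math.sqrt(2.0 * math.pi / (beta * m * omega**2))
    return math.log(z_kin) + math.log(z_pot)

def mean_energy(beta, d=1e-6):
    """-d(ln Z)/d(beta) by a central difference, as in (25-7)."""
    return -(ln_z(beta + d) - ln_z(beta - d)) / (2.0 * d)

beta = 0.5                       # so that kT = 1/beta = 2 in these units
print(mean_energy(beta), 1.0 / beta)
```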


25-2 Equipartition theorem 


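It follows from (21-33) and (21-35) that

$$\frac{e^{-\beta\epsilon_i}}{Z_0} \qquad (25\text{-}12)$$

is the probability that a subsystem will be found in the $i$th cell of volume
$d\Omega_\mu = h^l$ in $\mu$-space.
Let us consider the following average:

$$\overline{p_1\frac{\partial\epsilon}{\partial p_1}} = \frac{1}{h^l Z_0}\int p_1\frac{\partial\epsilon}{\partial p_1}\,e^{-\beta\epsilon}\,dp_1\cdots dq_l \qquad (25\text{-}13)$$

where

$$h^l Z_0 = \int e^{-\beta\epsilon}\,dp_1\cdots dq_l \qquad (25\text{-}14)$$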


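If we first integrate by parts with respect to $p_1$, noting that, for this purpose,

$$\frac{\partial\epsilon}{\partial p_1}\,e^{-\beta\epsilon}\,dp_1 = -\frac{1}{\beta}\,d\left(e^{-\beta\epsilon}\right)$$

the integral in (25-13) becomes

$$-\frac{1}{\beta}\int\left[p_1 e^{-\beta\epsilon}\right]_{p_{1\,min}}^{p_{1\,max}}dp_2\cdots dq_l + \frac{1}{\beta}\int e^{-\beta\epsilon}\,dp_1\cdots dq_l \qquad (25\text{-}15)$$

Usually a momentum ranges from $-\infty$ to $+\infty$, and $\epsilon \to \infty$ at these
limits; consequently $e^{-\beta\epsilon} = 0$ at the limits $p_{1\,max}$ and $p_{1\,min}$, so that the
first integral in (25-15) vanishes. All that remains in (25-15) is the second
integral which equals $h^l Z_0/\beta$ by (25-14); substituting this resultant value
for the integral of (25-13), we find that

$$\overline{p_1\frac{\partial\epsilon}{\partial p_1}} = \frac{1}{\beta} = kT \qquad (25\text{-}16)$$

The same result would have been obtained for any choice of $p_j$, and
therefore we can say in general that

$$\overline{p_j\frac{\partial\epsilon}{\partial p_j}} = kT \qquad (25\text{-}17)$$

Proceeding in exactly the same manner, we find that

$$\overline{q_j\frac{\partial\epsilon}{\partial q_j}} = kT \qquad (25\text{-}18)$$

provided that

$$\left[q_j e^{-\beta\epsilon}\right]_{q_{j\,min}}^{q_{j\,max}} = 0 \qquad (25\text{-}19)$$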


The last condition will be valid for all the cases to which we shall apply 
(25-18); for example, from (20-4) it is easily seen to hold for the oscillator 
and also for the free particle in a box according to the discussion in 
connection with (22-8). 

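In order to apply these results in a general manner we shall let $p_1, \ldots, p_l$,
$q_1, \ldots, q_l$ be denoted by $x_1, x_2, \ldots, x_{2l}$, which are chosen in an appropriate
order as shown below. Let us suppose that

$$\epsilon = \epsilon' + \epsilon'' \qquad (25\text{-}20)$$

where

$$\epsilon' = \sum_{i,j=1}^{f} a_{ij}(x_{f+1}, \ldots, x_{2l})\,x_i x_j \qquad (f \leq 2l) \qquad (25\text{-}21)$$

and where $\epsilon'' = \epsilon''(x_{f+1}, \ldots, x_{2l})$ and, in addition, $\epsilon''$ has no quadratic
parts. Therefore the variables $x_1, \ldots, x_f$ appear only quadratically in the
energy expression, while the rest do not appear quadratically. In all
practical cases, $\epsilon'$ will be a positive quadratic form, so that $\epsilon \to \infty$ if
$x_j \to \pm\infty$ and (25-17) and (25-18) therefore apply to $\epsilon$. If $j = 1, 2, \ldots, f$,
then it follows from (25-20) and (25-21) that

$$\frac{\partial\epsilon}{\partial x_j} = \sum_{i=1}^{f} a_{ij}x_i + \sum_{k=1}^{f} a_{jk}x_k$$

and therefore

$$\sum_{j=1}^{f} x_j\frac{\partial\epsilon}{\partial x_j} = \sum_{i,j} a_{ij}x_i x_j + \sum_{j,k} a_{jk}x_j x_k = 2\epsilon'$$

If we solve for $\epsilon'$ and use (25-17) and (25-18), we obtain

$$\bar\epsilon' = \tfrac{1}{2}\sum_{j=1}^{f}\overline{x_j\frac{\partial\epsilon}{\partial x_j}} = \tfrac{1}{2}fkT \qquad (25\text{-}22)$$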


which is our desired result. It can also be stated as: 


EQUIPARTITION THEOREM. Any dynamical variable ($p$ or $q$) which
appears in the energy $\epsilon$ of a subsystem quadratically, and only quadratically,
contributes $\tfrac{1}{2}kT$ to the average energy $\bar\epsilon$ and, therefore, $\tfrac{1}{2}k$ to the
subsystem heat capacity $\bar c_v$.

This theorem is a purely classical result since it depends on our use of
integrals to evaluate the averages in $\mu$-space.
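
The role of the word "quadratically" in the theorem can be seen in a small numerical experiment. The Python sketch below evaluates Boltzmann averages by a simple midpoint rule for one variable whose energy is quadratic and one whose energy is quartic; the coefficients, the value of $kT$, and the function name are arbitrary choices for this illustration. Since $x\,\partial\epsilon/\partial x$ equals $2\epsilon$ in the first case and $4\epsilon$ in the second, the general result that this average is $kT$ implies mean energies of $\tfrac{1}{2}kT$ and $\tfrac{1}{4}kT$ respectively.

```python
import math

def boltzmann_average(g, energy, kT, x_max=20.0, n=40000):
    """Average of g(x) with weight exp(-energy(x)/kT) on [-x_max, x_max]."""
    h = 2.0 * x_max / n
    num = den = 0.0
    for i in range(n):
        x = -x_max + (i + 0.5) * h          # midpoint of the i-th strip
        w = math.exp(-energy(x) / kT)
        num += g(x) * w
        den += w
    return num / den

def quadratic(x):
    return 1.5 * x * x                      # enters the energy quadratically

def quartic(x):
    return 0.3 * x**4                       # does not

kT = 0.7
avg_quadratic = boltzmann_average(quadratic, quadratic, kT)
avg_quartic = boltzmann_average(quartic, quartic, kT)
print(avg_quadratic, kT / 2.0)
print(avg_quartic, kT / 4.0)
```

Only the quadratic variable carries the equipartition share of $\tfrac{1}{2}kT$.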


Example. Monatomic Gas. According to (22-8),

$$\epsilon = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right)$$

in the container. Since there are three dynamical variables entering
quadratically, we obtain

$$\bar\epsilon = \tfrac{3}{2}kT, \qquad \bar c_v = \tfrac{3}{2}k, \qquad U = \tfrac{3}{2}NkT, \qquad c_v = \tfrac{3}{2}R$$

which agree with our previous results (17-14) and (17-17) which were
obtained in a more detailed manner.


Example. Rigid Rotator. If we neglect external forces such as gravity,
we find from (22-1) and (24-5) that

$$\epsilon = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right) + \frac{1}{2m'a^2}\left(p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}\right) \qquad (25\text{-}23)$$

Thus we have five variables entering in the proper way; the coefficient
of $p_\phi^2$ is independent of the variables which do appear quadratically in
(25-23). Therefore

$$c_v = \tfrac{5}{2}R, \qquad c_p = \tfrac{7}{2}R, \qquad \gamma = \frac{c_p}{c_v} = \frac{7}{5} = 1.40 \qquad (25\text{-}24)$$

These results agree very well with the measured heat capacities of
diatomic gases. In this example, we could say that $\bar\epsilon$ has a translational
contribution of $\tfrac{3}{2}kT$ and a rotational contribution of $kT$.
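
The counting used in the last two examples is easily mechanized. The Python sketch below returns the molar heat capacities for a given number $f$ of quadratic variables per molecule; the function name is hypothetical, and the relation $c_p - c_v = R$ for an ideal gas is assumed from the earlier thermodynamic discussion.

```python
R = 8.314462618  # molar gas constant, J/(mol K)

def ideal_gas_heat_capacities(f):
    """Molar c_v, c_p and gamma for f quadratic variables per molecule."""
    c_v = 0.5 * f * R
    c_p = c_v + R            # ideal-gas relation c_p - c_v = R
    return c_v, c_p, c_p / c_v

for label, f in (("monatomic gas", 3), ("rigid rotator", 5)):
    c_v, c_p, gamma = ideal_gas_heat_capacities(f)
    print(f"{label}: c_v = {c_v:.2f}, c_p = {c_p:.2f} J/(mol K), gamma = {gamma:.3f}")
```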

25-3 Monatomic crystals


We consider only the crystal type for which there are N atoms, rather 
than molecules, at the lattice points; an example of such a crystal is
diamond. In a real crystal there are large forces binding the atoms
together; hence we certainly cannot consider the N atoms to be inde- 
pendent subsystems. However, if the displacements of the atoms from 
their equilibrium positions are not too large, we can determine an appro- 
priate choice of subsystems by using some of the results from the mechanics 
of coupled systems as discussed for example in (I: Chapter 12). 

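For small displacements $u_j$, the potential energy of the system has the
form

$$V = \tfrac{1}{2}\sum_{j,k} v_{jk}\,u_j u_k$$

where the $v_{jk}$ are constants. If one introduces the normal coordinates $\zeta_j$,
the potential energy takes the form

$$V = \tfrac{1}{2}\sum_j \omega_j^2\zeta_j^2$$

where the $\omega_j$ are the normal (circular) frequencies of the coupled system.
The Lagrangian then has the form

$$L = \tfrac{1}{2}\sum_j\left(\dot\zeta_j^2 - \omega_j^2\zeta_j^2\right)$$

and the Hamiltonian becomes

$$H = \tfrac{1}{2}\sum_j\left(p_j^2 + \omega_j^2\zeta_j^2\right) \qquad (25\text{-}25)$$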


where $p_j$ is the generalized momentum conjugate to the normal coordinate
$\zeta_j$. Therefore we see from (25-25) that the coupled system is completely
equivalent to a collection of independent harmonic oscillators. In other
words, the independent subsystems are the normal modes of the coupled
system; the Hamiltonian (25-25) is of the form we assumed in (21-1), so
that all our results are applicable to the monatomic crystal considered as
a superposition of its normal modes.

We have seen in (25-11) that the average energy of an oscillator with its
two quadratic variables is $kT$, so that the average energy of the $j$th normal
mode of the crystal is

$$\bar\epsilon_j = kT \qquad (25\text{-}26)$$


For the $N$ particles, $3N$ coordinates will be needed. Of this number,
three will describe translations of the crystal as a whole, and three the
rotation. Hence there will be $3N - 6$ coordinates describing the oscillations,
so that there will be $3N - 6$ normal modes of the crystal, each of
which will contribute $kT$ to the total energy $U$. Therefore

$$U = (3N - 6)kT \approx 3NkT \qquad (25\text{-}27)$$

Fig. 25-1


Using (17-14') and (9-14), we find the molar energy to be $u = 3RT$, so that
the molar heat capacity is

$$c_v = 3R \approx 6 \text{ kilocalories/kilomole-degree} \qquad (25\text{-}28)$$

The last result is the famous empirical law of Dulong and Petit, and we
see how it is accounted for by classical statistical mechanics. The theory
predicts not only that these solids should all have about the same heat
capacity but also that $c_v$ should be independent of temperature, and this is
found to be approximately true. However, (25-28) cannot be completely
correct because it violates the third law of thermodynamics from which
we found that $c_v$ should vanish as $T \to 0$; the experimental values of $c_v$
for solids depend on temperature in the manner shown in Fig. 25-1 and
begin to decrease from the Dulong-Petit value (25-28) at a temperature
characteristic of the material. We shall return to this problem in the next
chapter.
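
The numerical value quoted in (25-28) follows from a one-line unit conversion, sketched here in Python. The constants are standard values supplied for this example (the thermochemical calorie is assumed), and the function name is hypothetical.

```python
R_JOULE = 8.314462618        # molar gas constant, J/(mol K)
JOULE_PER_CALORIE = 4.184    # thermochemical calorie (assumed definition)

def dulong_petit():
    """Classical molar heat capacity 3R of a monatomic solid, in kcal/(kmol K)."""
    # J/(mol K) -> cal/(mol K), which is numerically the same as kcal/(kmol K).
    return 3.0 * R_JOULE / JOULE_PER_CALORIE

print(dulong_petit())
```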


25-4 Thermal radiation 


The next application of the equipartition theorem will be to a system 
which is a continuum, in contrast to the systems of discrete particles 
considered up to this point. The system consists of the electromagnetic 
fields in a box and in equilibrium with the walls of the box which are at
the temperature $T$. In this way we can ascribe the temperature $T$ to the
electromagnetic energy, which is accordingly called thermal radiation or
black-body radiation. 

First of all, we shall use the second law of thermodynamics to show that, 
when a body is in temperature equilibrium with the radiation in the box, 
the radiation can be described by giving the energy density for the various 
frequencies, and that this energy density is a universal function of the 
temperature only. 

Let us consider a box with perfectly reflecting walls which neither emit 
nor absorb radiation. The electromagnetic energy in the box is therefore 
isolated from its surroundings. Suppose we now assume that two different 
objects (1 and 2) are placed in the box and kept at the same temperature 
T. In addition, let us assume that we can place a filter between the objects 
so that the box is divided into two portions as shown in Fig. 25-2. The 
filter is assumed to be perfectly transparent for a single frequency $\nu$ and is
perfectly reflecting for all other frequencies.

First we assume that the box is divided by a screen that is perfectly
reflecting for all frequencies, thus isolating the two bodies from each other
in their own portions of the box. Each body will separately absorb and
emit radiation until it reaches equilibrium, that is, until the energy density
about it is high enough that it absorbs as much energy as it radiates. Let
us assume, for example, that the energy density of the radiation of frequency
$\nu$ in equilibrium with body 1 is greater than for 2, that is, $u_{1\nu} > u_{2\nu}$.
Were we now to imagine the filter for frequency $\nu$ to be put into position,
there would be a net flow of electromagnetic energy through this screen
from side 1 to side 2. As this would decrease $u_{1\nu}$, body 1 would emit more
than it absorbs while the converse would be true for body 2. The net
effect of the introduction of the filter would therefore be a transfer of
energy between two objects at the same temperature without any work
having been done in the process. Since this would violate the second law,
we conclude that our initial assumption that $u_{1\nu} > u_{2\nu}$ is invalid. Similarly,
we would find that $u_{2\nu} > u_{1\nu}$ is not possible; therefore the only possibility
is $u_{1\nu} = u_{2\nu} = u_\nu$, so that the density at frequency $\nu$ is independent of the
nature of the bodies in equilibrium with the radiation.

Fig. 25-2

Since we can assume a filter for each possible frequency, we shall obtain
the same result for every frequency. Therefore if we define a density
function $u$ by

Electromagnetic energy per unit volume with frequencies
in the range $\nu$ to $\nu + d\nu$ which is in equilibrium at
a temperature $T$ $= u(\nu, T)\,d\nu$ (25-29)

we know that $u(\nu, T)$ is a universal function of frequency and temperature.
Our aim now is to calculate this function.

We can get an idea about how to proceed by recalling the results for 
the monatomic crystal; we introduced the normal modes and used the 
fact that the Hamiltonian of the system then corresponded to a collection 
of independent oscillators. Similarly, it is found in the study of cavities 
and wave guides that the electromagnetic fields in an enclosure can be 
written as superpositions of independent normal modes as discussed, for 
example, in (I: Chapter 30). Each normal mode in an enclosure is a 
standing wave which varies with time as $e^{-2\pi i\nu t}$ and thus is equivalent to a
one-dimensional harmonic oscillator. Therefore we can again treat this
system as a collection of independent harmonic oscillators. As we shall
see below, if the box is sufficiently large, the frequencies are very close
together and will have an almost continuous distribution. Accordingly,
the principal problem in finding $u(\nu, T)\,d\nu$ is to find the number $dZ$
of independent normal modes between $\nu$ and $\nu + d\nu$ since then we shall
have the total energy in the frequency range given by

$$u(\nu, T)\,V\,d\nu = \bar\epsilon(\nu)_{osc}\,dZ \qquad (25\text{-}30)$$

where $V$ is the total volume.

If we consider a rectangular box of sides $A$, $B$, $C$ whose walls are
perfectly conducting and therefore perfectly reflecting, the frequencies of
the normal modes are given by (I: 30-14) as

$$\nu = \frac{c}{2}\left[\left(\frac{m}{A}\right)^2 + \left(\frac{n}{B}\right)^2 + \left(\frac{p}{C}\right)^2\right]^{1/2} \qquad (25\text{-}31)$$

where $m$, $n$, and $p$ are positive integers. Let us first find the number of
modes whose frequency is less than a given frequency $\nu$, that is, the
number of possible sets of $m$, $n$, $p$ which fulfill this condition. If we plot
the possible values of $m$, $n$, $p$ in a set of rectangular axes with these indices
as coordinates, we get a set of lattice points, each point corresponding to
a normal frequency. The points which lie in the $mn$ plane are shown in
Fig. 25-3a. If we write (25-31) in the form

$$\left(\frac{m}{2A\nu/c}\right)^2 + \left(\frac{n}{2B\nu/c}\right)^2 + \left(\frac{p}{2C\nu/c}\right)^2 = 1 \qquad (25\text{-}32)$$

we see that the surface of constant $\nu$ in the $mnp$ space is an ellipsoid with
semiaxes $2A\nu/c$, $2B\nu/c$, $2C\nu/c$; this ellipsoid is shown in Fig. 25-3b, and its
projection on the $mn$ plane in Fig. 25-3a. Since $m$, $n$, and $p$ are integers,
the coordinates of the lattice points differ by unity; hence the volume
corresponding to each lattice point is $1 \times 1 \times 1 = 1$.

If we let $N_\nu$ be the number of normal frequencies which are less than the
given frequency $\nu$, we see that $N_\nu$ equals the total volume enclosed by the
constant frequency surface divided by the (unit) volume associated with
each lattice point. Since $m$, $n$, $p$ are all positive, the volume involved is
one-eighth that of the whole ellipsoid and therefore

$$N_\nu = \frac{1}{8}\cdot\frac{4\pi}{3}\left(\frac{2A\nu}{c}\right)\left(\frac{2B\nu}{c}\right)\left(\frac{2C\nu}{c}\right)$$

so that

$$N_\nu = \frac{4\pi V\nu^3}{3c^3} \qquad (25\text{-}33)$$

where $V = ABC$ is the total volume of the box. The number of normal
frequencies in the range $d\nu$ is the differential of (25-33):

$$dN_\nu = \frac{4\pi V}{c^3}\,\nu^2\,d\nu \qquad (25\text{-}34)$$

For electromagnetic waves, however, the number of normal modes is
twice as great as (25-34) because there are two independent directions of
polarization which are possible for each standing wave of a given frequency,
as discussed in connection with (I: 30-19). Therefore we find from (25-34)
that

$$dZ = \frac{8\pi V}{c^3}\,\nu^2\,d\nu \qquad (25\text{-}35)$$

If we substitute this result into (25-30), we find the total energy in the
frequency range $\nu$ to $\nu + d\nu$ to be given by

$$U(\nu, T)\,d\nu = u(\nu, T)\,V\,d\nu = \frac{8\pi V\nu^2}{c^3}\,\bar\epsilon(\nu)_{osc}\,d\nu \qquad (25\text{-}36)$$

so that the energy density per unit frequency interval is

$$u(\nu, T) = \frac{8\pi\nu^2}{c^3}\,\bar\epsilon(\nu)_{osc} \qquad (25\text{-}37)$$

If we substitute into (25-37) the value $kT$ found for the average energy
of an oscillator in (25-11), we obtain

$$u(\nu, T) = \frac{8\pi\nu^2}{c^3}\,kT \qquad (25\text{-}38)$$

Equation (25-38) is called the Rayleigh-Jeans law, and we have derived it
as an application of the equipartition theorem by substantially the original
method. This law is in good agreement with experimental results at low
enough frequencies, but it breaks down at high frequencies since we see
that $u \to \infty$ as $\nu \to \infty$, which is physically impossible.
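
The divergence can be made concrete by integrating (25-38) up to a cut-off frequency. The Python sketch below uses SI values of the constants; the function names and the cut-off frequencies are choices made for this illustration.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
C = 2.99792458e8     # speed of light, m/s

def rayleigh_jeans(nu, T):
    """Energy density per unit frequency interval, (25-38), in J s / m^3."""
    return 8.0 * math.pi * nu**2 * K_B * T / C**3

def energy_below(nu_max, T):
    """Integral of (25-38) from 0 to nu_max, in J / m^3."""
    return 8.0 * math.pi * nu_max**3 * K_B * T / (3.0 * C**3)

for nu_max in (1e12, 1e13, 1e14, 1e15):
    print(nu_max, energy_below(nu_max, 300.0))
```

The total energy density grows as the cube of the cut-off and therefore has no finite limit, which is the failure noted above.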

The examples of the heat capacity of solids and of the energy distribution 
of thermal radiation show that our statistical mechanics is not yet com- 
pletely correct. We have obtained results which agree well with experiment 
in a certain range of the variables but become quite incorrect elsewhere. 
What we have left out of our considerations is the fact that it is quantum 
mechanics rather than classical mechanics which is the appropriate 
scheme to be used for the description of atoms and molecules. In the next 
two chapters, we show how the results of quantum mechanics can be 
incorporated into the theory of statistical mechanics. 


Exercises 


25-1. Show that the entropy of a system of linear harmonic oscillators as 
obtained from (21-64) is 
$$S = Nk\ln\left(\frac{ekT}{h\nu}\right)$$

where $\nu = \omega/2\pi$. Show that, if (21-53) were used, the entropy would not have
been extensive. What are the physical reasons for treating a system of oscillators 
as distinguishable? 

25-2. Show that the quantities analogous to (25-17) and (25-18) are 


ad _ Pe pe kT 


for relativistic particles where the energy dependence on momentum is given 
by (22-27). Use these expressions to find the heat capacity per molecule of 
an ideal gas in the extreme relativistic limit, and compare with the results of 
Exercise 22-3. 


26 Classical quantum statistical mechanics 


A complete discussion of the fundamentals of quantum mechanics will not 
be necessary for our purposes, and we shall mention only those principal 
results which are essential. We have already seen that the effect of the 
uncertainty principle is to make it necessary to choose the size of the cells 
in $\mu$-space with the definite volume $h^f$ as given by (22-20) where $f$ is the
number of degrees of freedom of a subsystem.

Another fundamental result of interest to us is that the energies of the
system can no longer have any arbitrary value, as was possible in principle 
for classical mechanics. According to quantum mechanics, only certain 
states are possible for the system and a definite energy corresponds to each 
possible state. For the systems of most interest, these energies are usually 
discrete. Another useful result is that, in general, there exists a state of 
lowest energy for a given system. These allowed energies are calculated by 
the specific methods of quantum mechanics; the results for the only two 
cases which we shall need to consider are given below. 


Free particle in a rectangular box of sides $L_x$, $L_y$, $L_z$


The classical energy as given by (22-1) is 


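$$\epsilon = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right) \tag{26-1}$$

The possible quantum mechanical energies are given by

$$\epsilon_{n_x n_y n_z} = \frac{h^2}{8m}\left[\left(\frac{n_x}{L_x}\right)^2 + \left(\frac{n_y}{L_y}\right)^2 + \left(\frac{n_z}{L_z}\right)^2\right] \tag{26-2}$$

where $n_x = 1, 2, 3, \ldots$; $n_y = 1, 2, 3, \ldots$; $n_z = 1, 2, 3, \ldots$.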


One-dimensional harmonic oscillator 


The classical energy as given by (20-3) is 


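$$\epsilon = \frac{p^2}{2m} + \frac{1}{2}m\omega^2 q^2 \tag{26-3}$$

The quantum energies are

$$\epsilon_n = \left(n + \tfrac{1}{2}\right)h\,\frac{\omega}{2\pi} = \left(n + \tfrac{1}{2}\right)h\nu \tag{26-4}$$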


where n = 0, 1, 2, 3,.... 


26-1 The quantum partition function 


The partition function $Z_0$, which is so necessary for us, is defined by
(21-35) as a sum over the cells in $\mu$-space. However, from the quantum
mechanical point of view, the only fundamental features of our subsystems
which we can use are the possible states, and we can no longer describe the
subsystems by giving their coordinates in $\mu$-space. We must somehow be
able to make a correspondence between the cells and the states. It is 
extremely difficult to do this rigorously; instead, we shall consider a 
simple example which shows quite clearly how this correspondence is 
obtained and which gives the correct result. 

We saw in (20-4) and Fig. 20-15 that the surfaces of constant energy of
the one-dimensional harmonic oscillator are the ellipses of semiaxes $\sqrt{2m\epsilon}$
and $\sqrt{2\epsilon/m\omega^2}$. The area enclosed by a given ellipse is

$$A = \pi\sqrt{2m\epsilon}\,\sqrt{\frac{2\epsilon}{m\omega^2}} = \frac{2\pi\epsilon}{\omega} = \frac{\epsilon}{\nu} \tag{26-5}$$

According to (26-4), however, the oscillator energies are restricted to
definite values and thereby correspond to particular ellipses in $\mu$-space.
In Fig. 26-1 we show the ellipses of areas $A_n$ and $A_{n+1}$ corresponding to
the energies $\epsilon_n$ and $\epsilon_{n+1}$. The difference in energy between these two
states can be written as

$$\epsilon_{n+1} - \epsilon_n = h\nu = (A_{n+1} - A_n)\,\nu \tag{26-6}$$

with the use of (26-4) and (26-5). Therefore the difference in area enclosed
by these ellipses in $\mu$-space is

$$A_{n+1} - A_n = h \tag{26-7}$$


and is shown shaded in the figure. We also know from (22-20) that $h$ is
equal to the cell volume for a system of one degree of freedom. Consequently,
as we go from one quantum state of the oscillator to the next, we
increase the area enclosed by the ellipse by exactly the size of a cell in
classical $\mu$-space. In this way we can correlate each state of the oscillator
with a cell in $\mu$-space; the cells do not have the rectangular shapes we
have been thinking about, but the size, not the shape, is the essential
aspect of the argument.

Fig. 26-1

Extending these results to a general system, we conclude that: Each
quantum state of a subsystem corresponds to one of the elementary cells of
volume $h^f$. Thus we see that a sum over cells in classical $\mu$-space is
equivalent to a sum over subsystem quantum states, and hence we can
write (21-35) as

$$Z_0 = \sum_{\text{cells}} e^{-\beta\epsilon_i} = \sum_{\text{quantum states}} e^{-\beta\epsilon_i} \tag{26-8}$$


In other words, we can continue to write 


$$Z_0 = \sum_i e^{-\beta\epsilon_i} \tag{26-9}$$

except that now the index $i$ labels each quantum state with corresponding
energy $\epsilon_i$. Similarly, all our basic formulas (21-36), (21-61), and (21-62)
will remain unchanged and we can proceed by the same general methods 
as before. 

As in classical mechanics, it is possible for quantum mechanical results 
also to have the property of degeneracy, that is, there can be several 
distinct states which have the same energy; also as in the classical case, 
degeneracy is generally a result of the particular symmetry possessed by 
the system. Often when there is degeneracy, it is convenient to group all 
the states with the same energy and to speak of this group as an energy 
level; the degree of degeneracy $g_j$ is defined as the number of states in the
level $j$ with energy $\epsilon_j$. It is clear that the sum over states (26-9) can also be
written as a sum over levels as

$$Z_0 = \sum_i e^{-\beta\epsilon_i} = \sum_j g_j\,e^{-\beta\epsilon_j} \tag{26-10}$$


Occasionally, it may be preferable to write the results in the latter way. 
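
The equivalence of the sum over states and the sum over levels can be checked on a small example. The sketch below is only an illustration: the four-state spectrum, the value of $\beta$, and the function names are hypothetical choices made here, not quantities taken from the text.

```python
import math

def partition_over_states(energies, beta):
    """Sum exp(-beta * e_i) over every individual quantum state, as in (26-9)."""
    return sum(math.exp(-beta * e) for e in energies)

def partition_over_levels(levels, beta):
    """Sum g_j * exp(-beta * e_j) over distinct energy levels, as in (26-10)."""
    return sum(g * math.exp(-beta * e) for e, g in levels)

# Hypothetical spectrum: one state at energy 0 and three degenerate states
# at energy 1.5 (arbitrary energy units; beta is in the reciprocal units).
states = [0.0, 1.5, 1.5, 1.5]
levels = [(0.0, 1), (1.5, 3)]      # (energy, degeneracy) pairs
beta = 0.7

z_states = partition_over_states(states, beta)
z_levels = partition_over_levels(levels, beta)
```

Both functions return the same number because grouping equal-energy terms does not change the sum; the level form simply has fewer terms to evaluate.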


26-2 Subsystem with two states 


As a simple but useful example, we consider a subsystem which has
only two states with energies

$$\epsilon_1, \qquad \epsilon_2 = \epsilon_1 + \Delta \tag{26-11}$$

Substituting these energies into (26-9), we find that

$$Z_0 = e^{-\beta\epsilon_1} + e^{-\beta\epsilon_2} = e^{-\beta\epsilon_1}\left(1 + e^{-\beta\Delta}\right) \tag{26-12}$$

$$\ln Z_0 = -\beta\epsilon_1 + \ln\left(1 + e^{-\beta\Delta}\right) \tag{26-13}$$

We obtain the energy from (21-36) as

$$U = U_0 + \frac{N\Delta\,e^{-\Delta/kT}}{1 + e^{-\Delta/kT}} \tag{26-14}$$

where

$$U_0 = N\epsilon_1 = \text{zero point energy} \tag{26-15}$$

The name “zero point energy” is used because $U \to U_0$ as $T \to 0$ since
$e^{-\Delta/kT} \to 0$.


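In the high-temperature limit, we have

$$\frac{\Delta}{kT} \ll 1, \qquad e^{-\Delta/kT} \simeq 1 - \frac{\Delta}{kT} \tag{26-16}$$

and (26-14) becomes

$$U \simeq U_0 + \frac{1}{2}N\Delta\left(1 - \frac{\Delta}{2kT}\right) \tag{26-17}$$

so that $U - U_0$ approaches $\frac{1}{2}N\Delta$ for very high temperatures. The
general dependence of $U - U_0$ on $T$ is shown in Fig. 26-2.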

We can understand why the energy approaches this limiting value by
looking at the populations of the states. We find from (21-37), (26-11),
and (26-12) that

$$n_1 = \frac{N}{1 + e^{-\Delta/kT}}, \qquad n_2 = \frac{N e^{-\Delta/kT}}{1 + e^{-\Delta/kT}} \tag{26-18}$$

from which we see that, as $T \to 0$, $n_1 \to N$ and $n_2 \to 0$, so that all
subsystems are in the lower state. For very high temperatures, we can use
(26-16) and see that, as $T \to \infty$,

$$n_1 \to \tfrac{1}{2}N, \qquad n_2 \to \tfrac{1}{2}N \tag{26-19}$$

and the states become equally populated. Thus the energy becomes

$$U = n_1\epsilon_1 + n_2\epsilon_2 \to \tfrac{1}{2}N(\epsilon_1 + \epsilon_2) = N\epsilon_1 + \tfrac{1}{2}N\Delta$$

as found in (26-17). The populations of the two states are shown as
functions of $T$ in Fig. 26-3.

Fig. 26-2 ($U - U_0$ as a function of $T$)

The heat capacity of the system can be found from (26-14) as

$$C_v = \left(\frac{\partial U}{\partial T}\right)_V = Nk\left(\frac{\Delta}{kT}\right)^2\frac{e^{\Delta/kT}}{\left(e^{\Delta/kT} + 1\right)^2} \tag{26-20}$$

which has the limiting forms:

$$\frac{\Delta}{kT} \gg 1: \qquad C_v \simeq Nk\left(\frac{\Delta}{kT}\right)^2 e^{-\Delta/kT} \xrightarrow[T \to 0]{} 0 \tag{26-21}$$

$$\frac{\Delta}{kT} \ll 1: \qquad C_v \simeq Nk\left(\frac{\Delta}{2kT}\right)^2 \xrightarrow[T \to \infty]{} 0 \tag{26-22}$$

We see that the heat capacity of this system now satisfies the third law of
thermodynamics since it vanishes at absolute zero. Since $C_v$ vanishes at
both the upper and lower limits, it has a maximum which is found to
occur at a temperature of about $\Delta/k$; Fig. 26-4 shows the general temperature
dependence of $C_v$. The physical reason for the vanishing of the heat
capacity at high temperatures can be seen with the help of Fig. 26-3.
Since the populations of the states become almost equal, any temperature
increase has an almost negligible effect in changing the relative populations
of the states; thus there is correspondingly almost no change in the
energy, and this means that the heat capacity continually decreases with $T$
and eventually is zero.

Fig. 26-3 (populations of the two states as functions of $T$)

Fig. 26-4 ($C_v$ as a function of $T$, with $\Delta/k$ marked on the temperature axis)
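
The behavior of the two-state system can also be checked numerically. The following sketch is an illustration only: it works with the reduced temperature $t = kT/\Delta$ (a convenience introduced here, not a symbol of the text), evaluates the excess energy from (26-14) and the heat capacity obtained by differentiating it, and scans a grid of temperatures for the heat-capacity maximum. The function names and the grid are arbitrary choices.

```python
import math

def excess_energy(t):
    """(U - U0)/(N*Delta) at reduced temperature t = kT/Delta, from (26-14)."""
    y = math.exp(-1.0 / t)            # Boltzmann factor exp(-Delta/kT)
    return y / (1.0 + y)

def heat_capacity(t):
    """C_v/(N*k), the temperature derivative of (26-14) in reduced units."""
    y = math.exp(-1.0 / t)
    return (1.0 / t) ** 2 * y / (1.0 + y) ** 2

# Scan reduced temperatures from 0.01 to 10 for the heat-capacity maximum.
grid = [0.01 * i for i in range(1, 1001)]
t_peak = max(grid, key=heat_capacity)
```

Written in terms of $e^{-\Delta/kT}$, the expressions stay finite at very low temperatures. The scan places the maximum at a reduced temperature somewhat below unity, that is, at a temperature of the order of $\Delta/k$, and `excess_energy` tends to one half at large `t`.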
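
Finally, we can briefly look at the entropy of this system. We find from
(21-61), (26-13), and (26-14) that

$$\frac{S}{Nk} = \frac{(\Delta/kT)\,e^{-\Delta/kT}}{1 + e^{-\Delta/kT}} + \ln\left(1 + e^{-\Delta/kT}\right) + \ln\left(\frac{e}{N}\right) \tag{26-23}$$

Using (26-16), we find the high-temperature value of (26-23) to be

$$\frac{S}{Nk} \xrightarrow[T \to \infty]{} \ln\left(\frac{2e}{N}\right) \tag{26-24}$$

It is instructive to compare this result with our basic beginning formulas;
if we insert (26-19) into (21-51) and (21-14), we find that

$$\frac{S}{k} = N - \tfrac{1}{2}N\ln\tfrac{1}{2}N - \tfrac{1}{2}N\ln\tfrac{1}{2}N = N\ln\left(\frac{2e}{N}\right)$$

which agrees exactly with (26-24).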


26-3 Monatomic ideal gas 


The partition function as obtained from (26-9) and (26-2) is

$$Z_0 = \sum_{n_x, n_y, n_z} e^{-(\beta h^2/8m)\left[(n_x/L_x)^2 + (n_y/L_y)^2 + (n_z/L_z)^2\right]} = Z_x Z_y Z_z \tag{26-25}$$

where

$$Z_j = \sum_{n_j=1}^{\infty} e^{-\beta h^2 n_j^2/8mL_j^2} = \sum_{n_j} e^{-\xi_j n_j^2} = \sum_{n_j} e^{-\xi_j n_j^2}\,\Delta n_j \tag{26-26}$$

and where

$$\xi_j = \frac{\beta h^2}{8mL_j^2} \tag{26-27}$$

We were able to perform the last step of inserting the $\Delta n_j$ in the sum in
(26-26) since $n_j$ takes on only integral values and therefore $\Delta n_j = 1$.

If the temperature is high enough, the box large enough, and $m$ large
enough, $\xi_j$ is small according to (26-27); therefore $e^{-\xi_j n_j^2}$ will not change
very much as $n_j$ changes in unit steps as is shown in Fig. 26-5, and we can
approximate the sum (26-26) by an integral over $n_j$. In this way we obtain


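$$Z_j \simeq \int_0^\infty e^{-\xi_j n^2}\,dn = \left(\frac{\pi}{4\xi_j}\right)^{1/2} = \frac{(2\pi mkT)^{1/2}L_j}{h} \tag{26-28}$$

with the aid of (18-9), (26-27), and (21-60). Substituting (26-28) into
(26-25) and noting that $V = L_x L_y L_z$ is the volume of the box, we obtain

$$Z_0 = \frac{V}{h^3}\,(2\pi mkT)^{3/2} \tag{26-29}$$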


which is exactly the result (22-10) previously obtained by the classical 
calculations, and consequently (26-29) will lead to all our previous results, 
such as the equation of state (22-15), for ideal gases. The result (26-29) 
justifies again the use of $h^f$ as the cell size in $\mu$-space since $h$ entered
through the quantum mechanical energy (26-2).

We note, however, that, if $T$ is small, $\xi_j$ as given by (26-27) will be large
and it will not be permissible to replace the sum by an integral as we did in
(26-28). Consequently, we see that the classical result (26-29) cannot be
applicable at very low temperatures and we need to use a more exact
calculation. We shall return to this question later.
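
How well the integral (26-28) represents the sum (26-26) can be seen by evaluating both for a small and a large value of the exponent coefficient. The sketch below is illustrative only; the argument `xi` stands for the dimensionless coefficient of $n_j^2$ in the exponent of (26-26), and the two sample values are arbitrary.

```python
import math

def z_sum(xi, tol=1e-16):
    """Direct sum of exp(-xi * n^2) over n = 1, 2, 3, ..., as in (26-26)."""
    total, n = 0.0, 1
    while True:
        term = math.exp(-xi * n * n)
        total += term
        # Stop once a new term no longer changes the total appreciably.
        if term < tol * max(total, 1e-300):
            return total
        n += 1

def z_integral(xi):
    """Integral approximation (pi / (4 xi))^(1/2) of (26-28)."""
    return math.sqrt(math.pi / (4.0 * xi))

def relative_error(xi):
    """Fractional difference between the integral and the exact sum."""
    return abs(z_sum(xi) - z_integral(xi)) / z_integral(xi)

err_high_temperature = relative_error(1e-4)   # small coefficient: many terms contribute
err_low_temperature = relative_error(1.0)     # large coefficient: only a few terms matter
```

For the small coefficient the two expressions differ by less than one percent, while for a coefficient of order unity the integral overestimates the sum badly, which is the failure at low temperatures described above.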


26-4 One-dimensional harmonic oscillator


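The partition function for this case is found from (26-9) and (26-4) to be

$$Z_0 = \sum_{n=0}^{\infty} e^{-\beta(n + \frac{1}{2})h\nu} = e^{-\frac{1}{2}\beta h\nu}\sum_{n=0}^{\infty} e^{-\beta nh\nu} = \frac{e^{-\frac{1}{2}\beta h\nu}}{1 - e^{-\beta h\nu}} \tag{26-30}$$

since the second sum in (26-30) is an infinite geometric series whose sum is
given by

$$\sum_{n=0}^{\infty} x^n = \frac{1}{1 - x} \qquad (x < 1) \tag{26-31}$$

It is sometimes convenient to define a characteristic temperature $\theta$ of
the oscillator by

$$\theta = \frac{h\nu}{k} \tag{26-32}$$

so that

$$\beta h\nu = \frac{\theta}{T} \tag{26-33}$$

Therefore $Z_0$ can also be written

$$Z_0 = \frac{e^{-\theta/2T}}{1 - e^{-\theta/T}} \tag{26-34}$$

From (26-30), we obtain

$$\ln Z_0 = -\tfrac{1}{2}\beta h\nu - \ln\left(1 - e^{-\beta h\nu}\right) \tag{26-35}$$

so that the average energy per oscillator as found from (25-1), (26-35), and
(26-33) is

$$\bar{\epsilon} = \frac{h\nu}{2} + \frac{h\nu}{e^{\beta h\nu} - 1} = \frac{k\theta}{2} + \frac{k\theta}{e^{\theta/T} - 1} \tag{26-36}$$

Similarly, the heat capacity per oscillator as found from (26-36) and
(25-2) is

$$\bar{c}_v = k\left(\frac{\theta}{T}\right)^2\frac{e^{\theta/T}}{\left(e^{\theta/T} - 1\right)^2} \tag{26-37}$$

The entropy per oscillator, $\bar{s} = S/N$, can be obtained from (21-64), (26-35),
and (26-36); the result is

$$\bar{s} = \frac{k(\theta/T)}{e^{\theta/T} - 1} - k\ln\left(1 - e^{-\theta/T}\right) \tag{26-38}$$
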


We can see the nature of these results more clearly by looking at the usual 
extreme cases. 
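

High temperature ($T \gg \theta$)


Since $\theta/T \ll 1$, we find from (26-36) through (26-38) that

$$\bar{\epsilon} \simeq kT + [\text{terms of order } (\theta/T)^2] \simeq kT \tag{26-39a}$$

$$\bar{c}_v \simeq k\left[1 - \frac{1}{12}\left(\frac{\theta}{T}\right)^2\right] \to k \tag{26-39b}$$

$$\bar{s} \simeq k\ln\left(\frac{eT}{\theta}\right) = k\ln\left(\frac{ekT}{h\nu}\right) \tag{26-39c}$$

These results agree in their limiting values for $T \to \infty$ with those of (25-11)
and Exercise 25-1 found from the classical equipartition theorem.


Low temperature ($T \ll \theta$)


Similarly, since $\theta/T \gg 1$, we find that

$$\bar{\epsilon} \simeq \tfrac{1}{2}k\theta + k\theta e^{-\theta/T} \to \tfrac{1}{2}k\theta = \tfrac{1}{2}h\nu \tag{26-40a}$$

$$\bar{c}_v \simeq k\left(\frac{\theta}{T}\right)^2 e^{-\theta/T} \to 0 \tag{26-40b}$$

$$\bar{s} \simeq k\,\frac{\theta}{T}\,e^{-\theta/T} \to 0 \tag{26-40c}$$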


Thus the energy approaches the zero point energy while the heat capacity 
becomes zero in agreement with the third law. The low-temperature 
limiting value of the entropy given by (26-40c) also agrees with the value 
zero required by the third law; this last result also shows again that the 
only imaginable system composed of linear oscillators is one in which 
each oscillator has its own equilibrium position so that they are localizable 
and hence distinguishable. 

We have seen in this example of the oscillator that the introduction of 
the results of quantum theory has gone a long way toward resolving many 
of our previous difficulties. At high temperatures the results agreed with 
all our calculations based on classical mechanics. At low temperatures, 
where the classical results were in violent disagreement with experiment 
and the requirements of the third law, the results obtained with the use of 
quantum theory agreed with the third law. We shall now go on to show 
that our results also lead to much better agreement with experiment as well. 
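
The oscillator results of this section are easy to verify numerically. The sketch below is only an illustration in reduced units: `t` denotes the ratio of the temperature to the characteristic temperature of the oscillator, energies are measured in units of $k$ times that characteristic temperature, and heat capacities in units of $k$. The function names and the cutoff `n_max` are choices made here.

```python
import math

def z_closed(t):
    """Closed form of the oscillator partition function, the final member of (26-30)."""
    return math.exp(-0.5 / t) / (1.0 - math.exp(-1.0 / t))

def z_series(t, n_max=2000):
    """Partial sum over levels n = 0 .. n_max - 1 of exp(-(n + 1/2)/t), as in (26-30)."""
    return sum(math.exp(-(n + 0.5) / t) for n in range(n_max))

def mean_energy(t):
    """Average energy per oscillator from (26-36), in reduced units."""
    return 0.5 + 1.0 / math.expm1(1.0 / t)

def heat_capacity(t):
    """Heat capacity per oscillator from (26-37), in units of k."""
    x = 1.0 / t
    return x * x * math.exp(x) / math.expm1(x) ** 2

series_matches = abs(z_series(1.0) - z_closed(1.0)) < 1e-12
```

At large `t` the mean energy approaches `t` itself (the equipartition value) and the heat capacity approaches unity, while at small `t` the energy settles at the zero point value 0.5 and the heat capacity becomes exponentially small.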


26-5 Heat capacity of monatomic crystals


We saw in our discussion of (25-25) and (25-27) that use of the normal
modes enables us to treat the solid as a collection of $3N$ independent
linear oscillators. Therefore, if $\bar{\epsilon}_j$ is the average energy of the $j$th normal
mode of frequency $\nu_j$ ($= \omega_j/2\pi$), the energy of the system can be found
from (26-36) to be

$$U = \sum_{j=1}^{3N}\bar{\epsilon}_j = \sum_j \frac{h\nu_j}{2} + \sum_j \frac{h\nu_j}{e^{h\nu_j/kT} - 1} \tag{26-41}$$

and the heat capacity as obtained from (26-37) and (26-33) is

$$C_v = \sum_{j=1}^{3N}(\bar{c}_v)_j = k\sum_j\left(\frac{h\nu_j}{kT}\right)^2\frac{e^{h\nu_j/kT}}{\left(e^{h\nu_j/kT} - 1\right)^2} \tag{26-42}$$



The sums can be evaluated once the frequencies of the normal modes of 
the crystal are known. In principle, the $\nu_j$ can be found from the knowledge
of the mechanical properties of the crystal lattice by the methods appro- 
priate to coupled mechanical systems as discussed in (I: Chapter 12). 
Such a program is quite difficult to carry out, but it has been done for 
some cases by Blackman and others, and the results obtained in this way 
from (26-42) agree very well with experimental results. We shall not 
discuss this method any further, but instead we shall turn to two approxi- 
mations to (26-41) and (26-42) which have been historically very impor- 
tant and are still very useful for many purposes. 

The first is due to Einstein, who made the very simple assumption that
all the frequencies $\nu_j$ are the same and equal to $\nu$; this would be the case if
the $N$ atoms oscillated independently with this same frequency. If we
evaluate (26-42) for a kilomole so that there are $3N = 3L$ terms in the sum,
and use (17-14'), we obtain

$$c_v = 3R\left(\frac{h\nu}{kT}\right)^2\frac{e^{h\nu/kT}}{\left(e^{h\nu/kT} - 1\right)^2} \tag{26-43}$$

We know from (26-39b) that this result gives the Dulong-Petit value $3R$
as its high-temperature limit, and also vanishes as $T \to 0$ according to
(26-40b) and in that sense is better than the classical result (25-28). The
general dependence of $c_v$ on $T$ as given by (26-43) is like that shown in Fig.
25-1; however, the low-temperature behavior of $c_v$ obtained from (26-40b)
is much too rapid a decrease with temperature to agree generally with
experiment. Nevertheless, the qualitative success of the Einstein theory
led Debye to formulate his more accurate theory which we shall consider
next.

Since there are so many normal modes, their frequencies will be almost
continuously distributed; if we let $dZ_\nu$ be the number of normal modes of
the crystal with frequencies in the range $\nu$ to $\nu + d\nu$, we can write (26-41)
as the integral

$$U = \int\bar{\epsilon}\,dZ_\nu = \int\left(\frac{1}{2}h\nu + \frac{h\nu}{e^{\beta h\nu} - 1}\right)dZ_\nu \tag{26-44}$$

In order to devise a theory of universal form, Debye obtained $dZ_\nu$ by
treating the crystal as a continuous isotropic solid whose normal modes as
given by the theory of elasticity are the transverse and longitudinal
elastic waves in the body.

In our discussion of thermal radiation following (25-31), we have 
already obtained the essential results we shall need. Although we were 
considering the frequency distribution of standing electromagnetic waves 
in a box, the essential feature of (25-31) is that it is a consequence of the 
boundary conditions that the waves had to satisfy at the surfaces of the 
box. One could do a similar calculation for the elastic body by solving 
the wave equation subject to the boundary conditions that the faces of the 
solid be free or held fixed. The calculation of the number of normal modes
$dN_\nu$ of a given type in the frequency interval $d\nu$ would go exactly as before
and the result (25-34) would again be obtained where $c$ is now the speed
appropriate to the wave. For the elastic body there can be longitudinal
waves of one possible polarization and speed $c_l$ so that their number $dZ_l$
is found from (25-34) to be

$$dZ_l = \frac{4\pi V}{c_l^3}\,\nu^2\,d\nu \tag{26-45}$$

In addition, there can be transverse elastic waves of two possible polarizations
and speed $c_t$ so that their number is

$$dZ_t = \frac{8\pi V}{c_t^3}\,\nu^2\,d\nu \tag{26-46}$$

making the total

$$dZ_\nu = dZ_l + dZ_t = 4\pi V\left(\frac{1}{c_l^3} + \frac{2}{c_t^3}\right)\nu^2\,d\nu \tag{26-47}$$

A continuum with its infinite number of degrees of freedom will have an
infinite number of normal modes; our solid, however, has only $3N$
normal modes. In order to take this fact into account, Debye assumed
that (26-47) holds up to a maximum frequency $\nu_m$ which is so defined as to
make the total number of modes equal to the actual number, that is,

$$\int_0^{\nu_m} dZ_\nu = 3N = \frac{4\pi V}{3}\left(\frac{1}{c_l^3} + \frac{2}{c_t^3}\right)\nu_m^3 \tag{26-48}$$

If (26-47) and (26-48) are combined, we obtain the simpler expression

$$dZ_\nu = \frac{9N}{\nu_m^3}\,\nu^2\,d\nu \tag{26-49}$$

so that (26-44) becomes

$$U = \frac{9Nh}{\nu_m^3}\int_0^{\nu_m}\left(\frac{1}{2} + \frac{1}{e^{\beta h\nu} - 1}\right)\nu^3\,d\nu \tag{26-50}$$

If we define the Debye temperature $\Theta$ by

$$\Theta = \frac{h\nu_m}{k} \tag{26-51}$$

and replace $\nu$ by the dimensionless variable $x = \beta h\nu = h\nu/kT$, we find
that (26-50) becomes

$$U = \frac{9}{8}Nk\Theta + 3NkT\,D\!\left(\frac{\Theta}{T}\right) \tag{26-52}$$

where

$$D(\alpha) = \frac{3}{\alpha^3}\int_0^{\alpha}\frac{x^3\,dx}{e^x - 1} \tag{26-53}$$

is called the Debye function. This function $D(\alpha)$ can be evaluated by
numerical methods, and tables of its values for various $\alpha$ are available.
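
As an indication of how such a numerical evaluation might go, the sketch below integrates the Debye integrand with Simpson's rule. This is one possible method chosen here for simplicity, not necessarily the one used to prepare published tables; the function name and the default number of steps are arbitrary.

```python
import math

def debye_function(alpha, steps=2000):
    """Evaluate D(alpha) = (3/alpha^3) * integral_0^alpha x^3/(e^x - 1) dx.

    Uses composite Simpson's rule; `steps` must be even. Arguments above
    roughly 700 would overflow the exponential and are not supported.
    """
    def integrand(x):
        # x^3/(e^x - 1) behaves like x^2 near the origin, so its value at 0 is 0.
        return 0.0 if x == 0.0 else x ** 3 / math.expm1(x)

    h = alpha / steps
    total = integrand(0.0) + integrand(alpha)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return 3.0 / alpha ** 3 * total * h / 3.0

d_small = debye_function(0.1)    # argument much less than 1
d_large = debye_function(50.0)   # argument much greater than 1
```

For a small argument the value is close to unity, and for a large argument it falls off as the inverse cube of the argument; both limits are treated analytically in the paragraphs that follow.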

According to its definition (26-51), $\Theta$ should be a definite constant for a
given crystal and should be independent of the temperature. Since
different crystals will have unlike elastic constants and therefore $c_l$ and $c_t$
will vary from crystal to crystal, we can expect the values of $\Theta$ to depend
considerably on the nature of the particular solid.

In the high-temperature limit, $\alpha = \Theta/T \ll 1$. Therefore, in (26-53),
$x \ll 1$ over the whole range of integration and we can replace the denominator
in the integrand by the expansion

$$e^x - 1 = x + \tfrac{1}{2}x^2 + \tfrac{1}{6}x^3 + \cdots$$

If we then use the expansion

$$\frac{1}{1 + y} = 1 - y + y^2 - \cdots$$

with $y = \tfrac{1}{2}x + \tfrac{1}{6}x^2$, keep all terms to the order of $x^4$ in the integrand of
(26-53), and integrate term by term, we find the following approximation
for the Debye function:

$$D(\alpha) \simeq 1 - \frac{3}{8}\alpha + \frac{1}{20}\alpha^2 \qquad (\alpha \ll 1) \tag{26-54}$$
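If we insert (26-54) into (26-52) and evaluate the energy for a kilomole so
that $Nk = R$, we find

$$u = 3RT + \frac{3R\Theta^2}{20T} \tag{26-55a}$$

$$c_v = \left(\frac{\partial u}{\partial T}\right)_v = 3R\left[1 - \frac{1}{20}\left(\frac{\Theta}{T}\right)^2\right] \xrightarrow[T \to \infty]{} 3R \tag{26-55b}$$

which again yields the classical Dulong-Petit value.

In the low-temperature limit where $\alpha = \Theta/T \gg 1$, we can approximate
$D(\alpha)$ by extending the upper limit in (26-53) to infinity; we obtain

$$D(\alpha) \simeq \frac{3}{\alpha^3}\int_0^{\infty}\frac{x^3\,dx}{e^x - 1} \tag{26-56}$$

The integral can be evaluated by a series expansion in $e^{-x} < 1$ and then
by use of the relation

$$\int_0^{\infty} x^n e^{-qx}\,dx = \frac{n!}{q^{n+1}} \tag{26-57}$$

which applies when $n$ is a positive integer:

$$\int_0^{\infty}\frac{x^3\,dx}{e^x - 1} = \int_0^{\infty}\frac{x^3 e^{-x}}{1 - e^{-x}}\,dx = \int_0^{\infty} x^3 e^{-x}\left(\sum_{m=0}^{\infty} e^{-mx}\right)dx = \sum_{p=1}^{\infty}\int_0^{\infty} x^3 e^{-px}\,dx = 3!\sum_{p=1}^{\infty}\frac{1}{p^4} \tag{26-58}$$

The last sum in (26-58) is a Riemann zeta function with the value $\pi^4/90$.
Substituting (26-58) into (26-56), we obtain

$$D(\alpha) \simeq \frac{\pi^4}{5\alpha^3} \qquad (\alpha \gg 1) \tag{26-59}$$

and when we evaluate (26-52) for a kilomole we obtain

$$u \simeq \frac{9}{8}R\Theta + \frac{3\pi^4 RT^4}{5\Theta^3} \tag{26-60a}$$

$$c_v = \left(\frac{\partial u}{\partial T}\right)_v \simeq \frac{12\pi^4 R}{5}\left(\frac{T}{\Theta}\right)^3 \tag{26-60b}$$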


Thus the low-temperature heat capacity is proportional to $T^3$ and vanishes
as $T \to 0$, in agreement with the third law. The $T^3$ dependence at low
temperature is not as rapid a decrease as the exponential decrease given by
the Einstein theory and is experimentally very well verified for simple
ionic crystals. This result does not agree with the low-temperature heat
capacities observed for metals for which $c_v \sim T$; we shall see later that
this can be accounted for as a specifically quantum mechanical result
arising from the free electrons in a metal.
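
Two of the statements above lend themselves to a quick numerical check: the value of the integral obtained from the series in (26-58), and the much slower low-temperature decrease of the Debye heat capacity compared with the Einstein one. The sketch below is illustrative only. It assumes, purely for the sake of comparison, that the Einstein and Debye characteristic temperatures are equal, and `t` denotes the temperature divided by that common characteristic temperature.

```python
import math

def series_integral(terms=10000):
    """Approximate the integral of x^3/(e^x - 1) from 0 to infinity
    by the partial sum 3! * sum_{p=1}^{terms} 1/p^4 of (26-58)."""
    return math.factorial(3) * math.fsum(1.0 / p ** 4 for p in range(1, terms + 1))

def debye_low_t(t):
    """c_v/3R from the low-temperature Debye result (26-60b)."""
    return (4.0 * math.pi ** 4 / 5.0) * t ** 3

def einstein(t):
    """c_v/3R from the Einstein formula (26-43)."""
    x = 1.0 / t
    return x * x * math.exp(-x) / (1.0 - math.exp(-x)) ** 2

ratio_at_low_t = debye_low_t(0.05) / einstein(0.05)
```

The series reproduces $\pi^4/15$ to many figures, and at one-twentieth of the characteristic temperature the Debye value exceeds the Einstein value by several orders of magnitude.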

If we look back at (26-52) as well as at the specific formulas given by
(26-55b) and (26-60b), we see that the Debye theory essentially predicts a
heat capacity which should be a universal curve when plotted as a function
of $T/\Theta$. Thus, by trying to fit experimental values to this curve, we can
obtain an estimate of the value of $\Theta$ characteristic of a given crystal. We
also see from (26-51) and (26-48) that

$$\Theta \sim \nu_m \sim \left(\frac{N}{V}\right)^{1/3}\bar{c} \tag{26-61}$$

where $\bar{c}$ is an appropriate average speed of the elastic waves. In the
theory of elasticity, it is shown that

$$\bar{c} \sim (\text{volume compressibility})^{-1/2} \tag{26-62}$$


An example of such a relation is given for fluid systems in (I: Exercise
18-5). Therefore we would expect qualitatively that an incompressible or
“hard” crystal would have a large value of $\Theta$, while $\Theta$ would be small for a
“soft” crystal. The few selected data given below, which compare the value of
$\Theta$ obtained from the experimental heat capacity with the value calculated
from the elastic constants, tend to verify this general prediction:

              $\Theta$ from heat capacity    $\Theta$ from elastic constants
Diamond                 1800                        1871
Aluminum                 396                         399
Copper                   313                         329
Lead                      86                          72

These temperatures are measured on the absolute temperature scale.
The agreement is surprisingly good considering the very simplified
assumptions which were made by Debye.


26-6 Thermal radiation 


We could get some of the results of interest to us from the Debye 
theory, but it is preferable to begin anew. We saw that the basic problem 
is the calculation of the energy density per unit frequency interval given by 
(25-37). Rather than use kT for the average oscillator energy, we should 
now use the expression (26-36). In the discussion of black-body radiation, 
it is customary to drop the zero point energy $\frac{1}{2}h\nu$, as it would lead to an
infinite (although constant) energy density when integrated over the
whole frequency range since the possible frequencies extend from 0 to $\infty$
as far as we know. In this manner we obtain the Planck distribution law

$$u(\nu, T) = \frac{8\pi h\nu^3}{c^3\left(e^{h\nu/kT} - 1\right)} \tag{26-63}$$

which agrees well with experiment and represents the beginning of quantum
theory.

For low frequencies, $h\nu/kT \ll 1$, (26-63) becomes

$$u(\nu, T) \simeq \frac{8\pi\nu^2 kT}{c^3} \tag{26-64}$$

which is the Rayleigh-Jeans law (25-38) we previously obtained. For high
frequencies, $h\nu/kT \gg 1$, (26-63) is approximately given by Wien's law:

$$u(\nu, T) \simeq \frac{8\pi h\nu^3}{c^3}\,e^{-h\nu/kT} \tag{26-65}$$

The general dependence of $u$ on $\nu$ as given by Planck's law is shown in
Fig. 26-6. The location of the frequency $\nu_m$ for which $u$ is a maximum can
be obtained from (26-63) by setting $\partial u/\partial\nu = 0$; the resulting condition
can be written as

$$(3 - x)e^x = 3 \tag{26-66}$$

with $x = h\nu_m/kT$. The solution of (26-66) is most easily found by successive
approximations; the result is that $x = 2.82$, so that

$$\nu_m = 2.82\,\frac{kT}{h} \tag{26-67}$$

which is known as Wien's displacement law and was discovered experimentally
by Wien before the formulation of Planck's theory. Equation
(26-67) shows that the frequency at which the maximum energy density
occurs increases as the temperature increases; this effect accounts
qualitatively for the observed change in color of objects as they are
heated.

Fig. 26-6 ($u(\nu, T)$ as a function of $\nu$, with the maximum at $\nu_m$)
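
The method of successive approximations mentioned for (26-66) can be sketched in a few lines. The rearrangement $x = 3(1 - e^{-x})$ used below is one convenient choice made here (it is equivalent to (26-66) after dividing by $e^x$), and the starting value, tolerance, and function name are likewise arbitrary.

```python
import math

def wien_root(x0=3.0, tol=1e-12, max_iter=200):
    """Solve (3 - x) e^x = 3 by iterating x -> 3 (1 - e^{-x}).

    The iteration converges because the slope 3 e^{-x} of the right-hand
    side is well below 1 near the root.
    """
    x = x0
    for _ in range(max_iter):
        x_new = 3.0 * (1.0 - math.exp(-x))
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("iteration did not converge")

x_m = wien_root()
# Residual of the original condition (26-66); it should be essentially zero.
residual = (3.0 - x_m) * math.exp(x_m) - 3.0
```

Starting from $x = 3$, the iteration settles within a few dozen steps on a root a little below 3, which can then be inserted into (26-67).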

We can find the total energy density $U/V$ by integrating (26-63) over all
frequencies; if we set $x = h\nu/kT$ and use (26-58), we obtain

$$\frac{U}{V} = \int_0^{\infty} u(\nu, T)\,d\nu = \frac{8\pi^5 k^4 T^4}{15(hc)^3} = \sigma T^4 \tag{26-68}$$

which is the empirical Stefan-Boltzmann law; we see that the constant of
proportionality $\sigma$ is a universal constant.

We can now obtain other thermodynamic functions of the radiation. 
The entropy per oscillator is given by (26-38) with $\theta/T = h\nu/kT$; this is
the appropriate form to use since the normal modes are distinguishable,
differing as they do by their directions of propagation and polarization as
indicated by our ability to associate each mode with a definite (and
different) point in Fig. 25-3a. Using (25-35), (26-38), (26-63), and (26-68),
we obtain

$$S = \int_0^{\infty}\frac{8\pi V\nu^2\,d\nu}{c^3}\left[\frac{h\nu/T}{e^{h\nu/kT} - 1} - k\ln\left(1 - e^{-h\nu/kT}\right)\right] = \frac{U}{T} - \frac{8\pi Vk^4T^3}{(hc)^3}\int_0^{\infty} x^2\ln\left(1 - e^{-x}\right)dx \tag{26-69}$$

If we use the series expansion

$$\ln(1 - y) = -\sum_{n=1}^{\infty}\frac{y^n}{n} \tag{26-70}$$

we can evaluate the integral in (26-69) by integrating term by term, and we
find that

$$\int_0^{\infty} x^2\ln\left(1 - e^{-x}\right)dx = -\sum_{n=1}^{\infty}\frac{1}{n}\int_0^{\infty} x^2 e^{-nx}\,dx = -2\sum_{n=1}^{\infty}\frac{1}{n^4} = -\frac{\pi^4}{45} \tag{26-71}$$

with the use of (26-57) and (26-58). If we substitute (26-71) into (26-69)
and use (26-68), we find that

$$S = \frac{4U}{3T} \tag{26-72}$$

The Helmholtz function is

$$F = U - TS = -\tfrac{1}{3}U \tag{26-73}$$

so that the pressure can be found from (11-65), (26-73), and (26-68) to be

$$p = -\left(\frac{\partial F}{\partial V}\right)_T = \frac{1}{3}\left(\frac{\partial U}{\partial V}\right)_T = \frac{1}{3}\left(\frac{U}{V}\right)$$

so that the equation of state is

$$pV = \tfrac{1}{3}U \tag{26-74}$$

which is the same as we previously found by thermodynamic methods in
Exercise 10-5.

Finally, the Gibbs function of the radiation is found to be

$$G = U + pV - TS = 0 \tag{26-75}$$


The significance of this result will become clearer when we reconsider 
this whole problem from another point of view. 


Exercises 


26-1. A material is composed of atoms having a magnetic moment $\mu$ which
may be oriented in the direction of, or opposite to, a field $B$ with energy $-\mu B$
and $+\mu B$ respectively. Find the paramagnetic moment and Curie constant.
Show that the equation of state obtained for this system leads to a Weiss
ferromagnetism which is compatible with the third law.

26-2. The quantum energy levels of a rigid rotator are $\epsilon_j = j(j + 1)h^2/8\pi^2 m'a^2$,
where $j = 0, 1, 2, \ldots$. The degeneracy of each level is $g_j = 2j + 1$. Find
the general expression for the partition function, and show that at high
temperatures it can be approximated by an integral. Evaluate the high-temperature
energy and heat capacity and compare your results with (25-24). Also find the
low-temperature approximations to $Z_0$, $U$, and $C_v$.

26-3. A system is composed of subsystems each of which has only two possible
states. Suppose that the number in the upper state is found to be greater than
the number in the lower. How would you describe the state of the whole
system? Plot the temperature as a function of the ratio $n_{\text{upper}}/n_{\text{lower}}$.

26-4. Show that the next term in the series for $c_v$ gives the formula

$$c_v = 3R\left[1 - \frac{1}{20}\left(\frac{\Theta}{T}\right)^2 + \frac{1}{560}\left(\frac{\Theta}{T}\right)^4\right]$$

rather than (26-55b).

26-5. Derive the Debye heat capacity for a two-dimensional crystal. Similarly, 
find the Stefan-Boltzmann radiation law for a two-dimensional space. 


27 Identical particles 


Our fundamental results need a further modification because of the 
recognition that elementary entities such as electrons, atoms, etc., are 
indistinguishable in principle. In our calculation of the number of
complexions in a given distribution given by (21-4), we counted all complexions
which differed by an interchange of the identical subsystems as actually
different; that is, we assumed that we could distinguish among the 
subsystems. However, this is not possible for truly identical quantities, 
and we now need to go back and take this feature into account. We shall 
find that all our previous calculations were sufficiently accurate under the 
conditions for which we have applied them so far. 

This recognition of the fact that the elementary units are indistinguish- 
able is often described as another correction due to quantum mechanics. 
This is not strictly true, of course; the necessity of such a step was first 
strongly emphasized when statistical mechanical calculations were first 
made for quantum mechanical systems. As a matter of fact, the necessity 
of correcting for indistinguishability had already been discussed by Gibbs 
about twenty years before the development of modern quantum mechanics. 


27-1 Introductory considerations 


While we are carrying out the correction discussed above, we shall, 
however, simultaneously take another quantum effect into account. For 
certain types of systems, there are restrictions on the number of particles 
that can simultaneously be in the same quantum state. The most famous 
example of this type of restriction is the Pauli exclusion principle for 
electrons, which states that no more than one electron can occupy a 
given state. It has been found that one can divide all particles into two
classes which are characterized by the possible values of $n_i$, the number of
particles in the $i$th quantum state of the particle. If there are no restrictions
on $n_i$, so that $n_i = 0, 1, 2, 3, \ldots$, we obtain Bose-Einstein statistics. If the
particles obey an exclusion principle like that for electrons so that $n_i = 0, 1$,
we obtain Fermi-Dirac statistics. It should be emphasized that this
conventional terminology is somewhat misleading because the basic 
ideas of our calculations are unchanged; only the specific methods of 
calculating the thermodynamic probabilities need to be modified. 

We should also point out that, although our previous method of 
calculation in Secs. 21-1 and 21-3 gave the correct answer, we used 
throughout an assumption which is not justified by the results; that is, 
we assumed that $n_i$ was large enough that Stirling's formula (21-12)
could be used to approximate $n_i!$. We can estimate the value of $n_i$ for a
typical situation from (21-33) and (22-10); we see that the largest value of
$n_i$ for a monatomic gas is about

$$n_i \simeq \frac{N}{Z_0} = \frac{Nh^3}{V(2\pi mkT)^{3/2}} \tag{27-1}$$

If we evaluate (27-1) for helium under standard conditions, we find that
$n_i \simeq 1/240{,}000$; in other words, in $\mu$-space only about one cell out of
each 240,000 even has a helium atom in it. Therefore $n_i$ is generally very
small rather than very large as was assumed in the use of Stirling's formula
to obtain (21-14); the basic reason why this problem has arisen is that
initially there was no necessity for using a cell size as small as $h^f$ which we
now know to be an essential requirement.
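
The order of magnitude of this estimate is easily reproduced. The sketch below is an illustration under stated assumptions: “standard conditions” are taken to mean 273.15 K and 101 325 Pa, the number density is obtained from the ideal-gas equation of state, and present-day values of the constants are used, so the result need only agree roughly with the figure quoted above.

```python
import math

# Physical constants in SI units (rounded present-day values).
H = 6.62607015e-34        # Planck constant, J s
K = 1.380649e-23          # Boltzmann constant, J/K
AMU = 1.66053907e-27      # atomic mass unit, kg

def max_occupation(mass, temperature, pressure):
    """Largest mean occupation N h^3 / [V (2 pi m k T)^(3/2)] from (27-1).

    The number density N/V is taken from the ideal-gas law p = (N/V) k T.
    """
    number_density = pressure / (K * temperature)
    return number_density * H ** 3 / (2.0 * math.pi * mass * K * temperature) ** 1.5

# Assumed inputs: helium-4 mass of 4.0026 u, 273.15 K, 101325 Pa.
n_max = max_occupation(4.0026 * AMU, 273.15, 101325.0)
cells_per_atom = 1.0 / n_max
```

With these inputs `cells_per_atom` comes out at a few hundred thousand, confirming that the occupation of a single cell is extremely small.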

We can easily avoid this difficulty by a slight change of procedure. 
Since the fundamental cell in $\mu$-space is so very small, the energy of a
particle in a given cell is very little different from that in neighboring cells.
Therefore we can collect the cells into groups such that the energy of a
particle is practically uniform within the group and the number $N_j$ of
particles in the $j$th group is quite large. For convenience, we shall choose
the same number $g$ of cells in each group although we shall see that this is
not necessary. Since the total number of cells in $\mu$-space is $M$, we have

$$\text{Number of groups} = \frac{M}{g} \tag{27-2}$$


Now we can do all our calculations in terms of the groups, and the 
numbers involved will be large enough to justify completely the use of 
Stirling’s formula. 


27-2 Bose-Einstein probability 


Let us represent the $N_j$ identical particles by $N_j$ identical zeros and let
us also number the $g$ cells so that they can be represented by the symbols
$i_1, i_2, \ldots, i_g$. We now imagine mixing the $i$'s and the zeros and then
arranging them in any sequence; for example,

$$i_1\,0\,i_2\,0\,0\,i_3\,i_4\,i_5\,0\,0\,0\,i_6\,i_7\,0\,i_8\,0\,i_9\,i_{10}\cdots \tag{27-3}$$

If we now require that the first symbol be an $i$, we can assume that the
zeros represent particles which are located in the cell represented by the
preceding $i$. For example, in the sequence above, cell 5 has three particles,
cell 2 has two, cells 1, 7, and 8 have one each, while the others shown
contain none.

The number of complexions which can be obtained from the $N_j$ particles
in the $g$ cells of the group is clearly equal to the number of arrangements
like (27-3) which we can make with these symbols. Since the first symbol
must be an $i$, it can be chosen in $g$ ways. The remaining $g + N_j - 1$
symbols can be arranged in $(g + N_j - 1)!$ ways. Therefore the number of
sequences such as (27-3) is

$$ g\,(g + N_j - 1)! \tag{27-4} $$


Although each of these sequences is a complexion, there are many
duplications in (27-4). For example, it is clear that every complexion can
be represented by one like (27-3) in which the cells are arranged in
numerical order; therefore permuting the $i$’s is unnecessary and, since
this can be done in $g!$ ways, we must divide (27-4) by $g!$. Similarly, the
zeros are indistinguishable, and hence their $N_j!$ possible arrangements do
not correspond to different complexions. Therefore, if we let $w_j$ be the
thermodynamic probability given by the number of different complexions
which are possible for the $j$th group of cells, we find from (27-4) that

$$ w_j = \frac{g\,(g + N_j - 1)!}{g!\,N_j!} \tag{27-5} $$


The probability W for the whole system is obtained by combining the 
probabilities for the independent groups according to (16-3), so that 


$$ W = w_1 w_2 \cdots w_j \cdots w_{M/g} \tag{27-6} $$

and therefore

$$ \ln W = \sum_{j=1}^{M/g} \ln w_j \tag{27-7} $$


Using (21-13) and (27-5), we obtain 


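$$ \begin{aligned} \ln w_j &= \ln g + (g + N_j - 1)\ln(g + N_j - 1) - g\ln g - N_j\ln N_j + 1 \\ &\approx (g + N_j)\ln(g + N_j) - g\ln g - N_j\ln N_j \end{aligned} \tag{27-8} $$

since $g \gg 1$ and $N_j \gg 1$. If we let

$$ n_j = \frac{N_j}{g} \tag{27-9} $$

be the average number of particles in a cell of the $j$th group, we can use
(27-9) to eliminate $N_j$ from (27-8); the result is

$$ \begin{aligned} \ln w_j &= g(n_j + 1)\ln[g(1 + n_j)] - g\ln g - g n_j\ln(g n_j) \\ &= g[(n_j + 1)\ln(1 + n_j) - n_j\ln n_j] \end{aligned} \tag{27-10} $$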


If we substitute (27-10) into (27-7), we obtain 


$$ \ln W = \sum_{j=1}^{M/g} g[(n_j + 1)\ln(1 + n_j) - n_j\ln n_j] \tag{27-11} $$

Since the bracketed term in (27-11) is the same for each of the $g$ cells of the
group by the definition of $n_j$, each term in (27-11) of the form $g$ multiplying
a bracket equals the sum of the bracket over each of the $g$ cells of
the group; therefore (27-11) can actually be written as a sum over all the
cells:

$$ \ln W = \sum_{i=1}^{M} [(n_i + 1)\ln(1 + n_i) - n_i\ln n_i] \tag{27-12} $$


This result no longer contains the group size and involves only quantities 
characteristic of the cell. 
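The quality of the Stirling approximation behind (27-10) can be checked against the exact count (27-5). The sketch below is illustrative only: the function names are hypothetical, and the exact logarithm is evaluated with the log-gamma function, using $\ln m! = \ln\Gamma(m+1)$.

```python
import math

def ln_w_exact(g, N):
    """Logarithm of Eq. (27-5), g (g+N-1)! / (g! N!), via log-gamma."""
    return math.log(g) + math.lgamma(g + N) - math.lgamma(g + 1) - math.lgamma(N + 1)

def ln_w_stirling(g, N):
    """Eq. (27-10): g[(n+1) ln(1+n) - n ln n] with n = N/g."""
    n = N / g
    return g * ((n + 1.0) * math.log(1.0 + n) - n * math.log(n))

# Keep n = N/g fixed at 0.5 and let the group grow
for g, N in [(10, 5), (1_000, 500), (100_000, 50_000)]:
    exact, approx = ln_w_exact(g, N), ln_w_stirling(g, N)
    print(g, N, exact, approx, abs(exact - approx) / exact)
```

The relative error shrinks as $g$ and $N_j$ grow at fixed $n_j$, which is the reason for collecting cells into large groups.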


27-3 Fermi-Dirac probability 


Before we go on to use (27-12), let us calculate the equivalent quantity 
for the Fermi-Dirac case in which an $n_i$ is restricted to the values 0 and 1.
We could continue with our sequence of zeros and i’s with the additional 
restriction that two zeros cannot be adjacent, but it is more convenient to 
start over. 

In order to distribute the $N_j$ particles among the $g$ cells, we can put the
first in any of the $g$ cells, the second in any of the $g - 1$ remaining cells,
the third in any of $g - 2$, etc.; in all, we can distribute them in the
following number of ways:

$$ g(g - 1)(g - 2)\cdots(g - N_j + 1) = \frac{g!}{(g - N_j)!} \tag{27-13} $$

In this process we have identified the particles as first, second, etc., but
this is not possible; accordingly we must divide (27-13) by the $N_j!$
possible permutations of the identical particles in order to obtain the
number of complexions $w_j$:

$$ w_j = \frac{g!}{N_j!\,(g - N_j)!} \tag{27-14} $$


By proceeding as before and using (21-12) and (27-9), we can approximate
$w_j$ as

$$ \begin{aligned} w_j &= \left(\frac{g}{e}\right)^{g}\left(\frac{e}{N_j}\right)^{N_j}\left(\frac{e}{g - N_j}\right)^{g - N_j} \\ &= (1 - n_j)^{N_j - g}\,n_j^{-N_j} \end{aligned} \tag{27-15} $$

so that

$$ \begin{aligned} \ln w_j &= (N_j - g)\ln(1 - n_j) - N_j\ln n_j \\ &= g[(n_j - 1)\ln(1 - n_j) - n_j\ln n_j] \end{aligned} \tag{27-16} $$

Therefore we can also write $\ln W$ obtained from (27-7) as a sum over the
cells:

$$ \ln W = \sum_i [(n_i - 1)\ln(1 - n_i) - n_i\ln n_i] \tag{27-17} $$
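The same kind of check can be made for the Fermi-Dirac count. This is a small sketch with hypothetical function names, comparing the exact logarithm of (27-14) with the approximate form (27-16) for a group with $0 < n_j < 1$.

```python
import math

def ln_w_fd_exact(g, N):
    """Logarithm of Eq. (27-14), g! / (N! (g-N)!), via log-gamma."""
    return math.lgamma(g + 1) - math.lgamma(N + 1) - math.lgamma(g - N + 1)

def ln_w_fd_stirling(g, N):
    """Eq. (27-16): g[(n-1) ln(1-n) - n ln n] with n = N/g, 0 < n < 1."""
    n = N / g
    return g * ((n - 1.0) * math.log(1.0 - n) - n * math.log(n))

g, N = 100_000, 30_000
print(ln_w_fd_exact(g, N), ln_w_fd_stirling(g, N))
```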


27-4 State of maximum probability 


Equations (27-12) and (27-17) can be combined into the single equation

$$ \ln W = \sum_i [(n_i \pm 1)\ln(1 \pm n_i) - n_i\ln n_i] \tag{27-18} $$

if we always choose the upper sign for the Bose-Einstein case and the
lower sign for Fermi-Dirac; we shall follow this convention throughout.

As in Sec. 21-3, we want to find the state of maximum probability and
identify it with the equilibrium state. As before, our distribution $n_i$ is
subject to the conditions of constant number of particles $N$ and constant
energy $U$ given by (21-7) and (21-8):

$$ N = \sum_i n_i = \text{const.}, \qquad U = \sum_i \epsilon_i n_i = \text{const.} \tag{27-19} $$

Consequently, the virtual variations $\delta n_i$ are restricted by equations
identical to (21-26) and (21-27):

$$ \sum_i \delta n_i = 0, \qquad \sum_i \epsilon_i\,\delta n_i = 0 \tag{27-20} $$


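The corresponding first order variation in (27-18) which vanishes for the
maximum of $\ln W$ is

$$ \begin{aligned} \delta\ln W = 0 &= \sum_i \left[\delta n_i\ln(1 \pm n_i) + (n_i \pm 1)\left(\frac{\pm\,\delta n_i}{1 \pm n_i}\right) - \delta n_i\ln n_i - \delta n_i\right] \\ &= \sum_i \delta n_i\ln\left(\frac{1}{n_i} \pm 1\right) \end{aligned} \tag{27-21} $$

We multiply (27-20) by the respective Lagrange multipliers $\alpha$ and $\beta$ and
subtract from (27-21) to obtain

$$ \sum_i \delta n_i\left[\ln\left(\frac{1}{n_i} \pm 1\right) - \alpha - \beta\epsilon_i\right] = 0 \tag{27-22} $$

Setting the coefficient of each $\delta n_i$ equal to zero, we get

$$ \ln\left(\frac{1}{n_i} \pm 1\right) = \alpha + \beta\epsilon_i \tag{27-23} $$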
so that the distribution of maximum probability is given by

$$ n_i = \frac{1}{e^{\alpha + \beta\epsilon_i} \mp 1} \tag{27-24} $$

The distributions for the two separate cases are therefore

$$ \text{(Bose-Einstein)} \qquad n_i = \frac{1}{e^{\alpha + \beta\epsilon_i} - 1} \tag{27-25} $$

$$ \text{(Fermi-Dirac)} \qquad n_i = \frac{1}{e^{\alpha + \beta\epsilon_i} + 1} \tag{27-26} $$


In general, the $\alpha$’s will be different for the two distributions.
The multipliers $\alpha$ and $\beta$ can in principle be determined from the two
conditions (27-19); thus, if we use (27-24), we find that

$$ N = \sum_i \frac{1}{e^{\alpha + \beta\epsilon_i} \mp 1} \tag{27-27a} $$

$$ U = \sum_i \frac{\epsilon_i}{e^{\alpha + \beta\epsilon_i} \mp 1} \tag{27-27b} $$

These results are seen to differ from the Boltzmann ones by the appearance
of the $\mp 1$ in the denominators.
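It is convenient to define a quantity $Q$ by

$$ \ln Q = \mp\sum_i \ln(1 \mp e^{-\alpha - \beta\epsilon_i}) \tag{27-28} $$

Therefore

$$ Q_{\text{B-E}} = \prod_i \frac{1}{1 - e^{-\alpha - \beta\epsilon_i}} \tag{27-29a} $$

$$ Q_{\text{F-D}} = \prod_i (1 + e^{-\alpha - \beta\epsilon_i}) \tag{27-29b} $$

We find from (27-28) that

$$ \frac{\partial\ln Q}{\partial\beta} = \mp\sum_i \frac{(\mp)(-\epsilon_i)\,e^{-\alpha - \beta\epsilon_i}}{1 \mp e^{-\alpha - \beta\epsilon_i}} = -\sum_i \frac{\epsilon_i}{e^{\alpha + \beta\epsilon_i} \mp 1} $$

and therefore we see from (27-27) that

$$ U = -\frac{\partial\ln Q}{\partial\beta} \tag{27-30} $$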
Similarly we can show that 
re he (27-31) 
B ae, 
$$ N = -\frac{\partial\ln Q}{\partial\alpha} \tag{27-32} $$
Thus the function $Q$ plays a role for these results which is similar to that of
the partition function $Z_0$ of our previous discussions.
We define our entropy by (21-51), and therefore

$$ S = k\ln W_{\max} = k\sum_i [(n_i \pm 1)\ln(1 \pm n_i) - n_i\ln n_i] \tag{27-33} $$

because of (27-18); this sum is to be evaluated with the distribution
(27-24) corresponding to maximum probability.
If we substitute (27-24) into (27-33) and use (27-27) and (27-28), we find
that

$$ S = \beta k U + k\ln Q + Nk\alpha \tag{27-34a} $$

$$ S = \frac{U}{T} + k\ln Q + Nk\alpha \tag{27-34b} $$

since we shall show shortly that $\beta = 1/kT$, as would be expected. The
Helmholtz function as obtained from (27-34b) is

$$ F = U - TS = -kT\ln Q - NkT\alpha \tag{27-35} $$
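The step from (27-33) to (27-34a) can be confirmed numerically for an arbitrary set of levels. The sketch below is only an illustration: the equally spaced levels, the values of $\alpha$ and $\beta$, and the function names are all assumptions made for the check, with `sign = +1` standing for the upper (Bose-Einstein) sign and `sign = -1` for the lower (Fermi-Dirac) sign.

```python
import math

def occupation(alpha, beta, eps, sign):
    """Eq. (27-24): 1 / (exp(alpha + beta*eps) -/+ 1)."""
    return 1.0 / (math.exp(alpha + beta * eps) - sign)

def entropy_two_ways(alpha, beta, levels, sign):
    """Return S/k from the sum (27-33) and from the closed form (27-34a)."""
    n = [occupation(alpha, beta, e, sign) for e in levels]
    N = sum(n)                                                  # Eq. (27-27a)
    U = sum(e * ni for e, ni in zip(levels, n))                 # Eq. (27-27b)
    ln_Q = -sign * sum(math.log(1.0 - sign * math.exp(-alpha - beta * e))
                       for e in levels)                         # Eq. (27-28)
    s_sum = sum((ni + sign) * math.log(1.0 + sign * ni) - ni * math.log(ni)
                for ni in n)
    s_closed = beta * U + ln_Q + N * alpha
    return s_sum, s_closed

levels = [0.1 * j for j in range(1, 51)]   # hypothetical energies, arbitrary units
for sign, name in [(+1, "Bose-Einstein"), (-1, "Fermi-Dirac")]:
    print(name, entropy_two_ways(alpha=0.5, beta=1.0, levels=levels, sign=sign))
```

The two numbers agree to rounding error for either sign, as the algebra requires.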


27-5 Boltzmann limit 


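In order to see the connection between these results and our previous
ones, let us consider the case

$$ e^{-\alpha} \ll 1, \qquad e^{\alpha} \gg 1 \tag{27-36} $$

Therefore

$$ \ln(1 \mp e^{-\alpha - \beta\epsilon_i}) \approx \mp e^{-\alpha - \beta\epsilon_i} \tag{27-37} $$

and (27-28) becomes

$$ \ln Q \approx e^{-\alpha}\sum_i e^{-\beta\epsilon_i} = e^{-\alpha}Z_0 \tag{27-38} $$

because of (26-9).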
Substituting (27-38) into (27-30), (27-31), and (27-32), we obtain 


= ae tly ee N =e “Zo 
op B de, 
so that 
pea N yo yalnZ ,  N@InZ (97.39) 
Lo op Bde, 


which are identical with the earlier equations (21-32), (21-36), and (21-37),
so that we are back to the case of Boltzmann statistics; this also shows
that $\beta = 1/kT$.

We could, of course, have seen this directly from the distribution
(27-24) under the condition (27-36), since then

$$ n_i \approx e^{-\alpha - \beta\epsilon_i} $$


which is exactly our earlier starting point (21-30). Consequently, we have 
justified our previous results by the more exact calculations of this chapter 
in which we have used Stirling’s formula more legitimately. 

We see from (27-38) and (27-39) that we also have

$$ \ln Q \approx e^{-\alpha}Z_0 = N $$

so that

$$ \alpha = \ln Z_0 - \ln N $$

When these expressions are substituted into (27-34b), we obtain

$$ S_{\text{Boltz}} = \frac{U}{T} + Nk + Nk\ln Z_0 - Nk\ln N = \frac{U}{T} + k\ln\frac{Z_0^N}{N!} $$

which is the same as (21-53d). Thus our calculations which now have
taken into account the identity of the particles have automatically introduced
corrected Boltzmann counting in the Boltzmann limit.

We now estimate the physical conditions which correspond to the
expression (27-36) for the validity of the Boltzmann distribution. Combining
(27-1) with the first equation in (27-39), we can estimate $e^{\alpha}$ for an ideal
monatomic gas by

$$ e^{\alpha} \approx \frac{V(2\pi mkT)^{3/2}}{Nh^3} \tag{27-40} $$

This is exactly the reciprocal of the quantity we evaluated previously, so that for helium
under standard conditions $e^{\alpha} \approx 240{,}000$. Consequently, we see that
classical Boltzmann statistics is a very good approximation under these
fairly typical circumstances.

It would be of interest to determine the conditions for which $e^{\alpha} \approx 1$ or
even $e^{\alpha} < 1$, for then we could expect to find significant deviations from
the Boltzmann case. We see from (27-40) that generally the conditions are
high density, small mass, and low temperature. In particular, the electron
has a very small mass, and we might expect these deviations to become
important for electrons. Although (27-40) can be used as an approximation,
the value of $\alpha$ can in principle always be determined from (27-27a).

A “non-degenerate” gas is one to which we can apply the “corrected”
Boltzmann statistics developed in Chapters 21 and 26. If this is not
possible and we must use the more exact results summarized in (27-24) and
subsequent equations, the gas is called “degenerate.” We have seen that
generally ordinary gases under ordinary conditions are non-degenerate,
while electrons, for instance, may be highly degenerate.
Exercises


27-1. Repeat the calculations leading to (27-12) and (27-17), but use a different 
number of cells in each group and thus show that the same results are obtained. 

27-2. For a monatomic gas, estimate the energy difference between neighboring
cells in $\mu$-space which will be of significance, and thereby justify our
assumption that $N_j$ and $g$ can be chosen to be large while the energy within a
group will be approximately constant.


28 Deviations of ideal gases from the Boltzmann limit


Before we consider some specific applications of our results, we shall see 
how the thermodynamic properties of the monatomic ideal gas composed 
of independent particles deviate from those found in the Boltzmann limit. 
These functions can be found by approximating the sums involved by 
integrals. This procedure will be satisfactory for any reasonable con- 
ditions except for the very degenerate Bose-Einstein gas which will be 
considered separately. 


28-1 General expressions for thermodynamic functions 


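If we define the density of states $c(\epsilon)$ so that the number of states with
energy between $\epsilon$ and $\epsilon + d\epsilon$ is $c(\epsilon)\,d\epsilon$, we can express the sum over
states of a quantity $F(\epsilon_i)$ as an integral over energy by

$$ \sum_i F(\epsilon_i) = \int_0^\infty F(\epsilon)\,c(\epsilon)\,d\epsilon \tag{28-1} $$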


Since we are interested in ideal gases, we require $c(\epsilon)$ for a free particle in
a box which is large enough that the energies of the states are almost
continuously distributed. Comparing (26-2) and (25-31), we see that we
can use the result (25-34) which we have already found for the number
$dN_\nu$ of standing waves in a given frequency interval by making the
substitutions:

$$ \nu \to \sqrt{\epsilon}, \qquad d\nu \to \frac{d\epsilon}{2\sqrt{\epsilon}}, \qquad c^2 \to \frac{h^2}{2m}, \qquad g\,dN_\nu \to c(\epsilon)\,d\epsilon \tag{28-2} $$

where $g$ is a possible degeneracy factor which gives the number of different
states which correspond to each translational state of energy given by
(26-2); the need for $g$ is analogous to the necessity of multiplying $dN_\nu$ by
the number of possible polarizations of a given wave as we did in (25-35)
and (26-46). After the substitutions listed in (28-2) are made, we obtain


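$$ c(\epsilon)\,d\epsilon = C\sqrt{\epsilon}\,d\epsilon \tag{28-3} $$

where

$$ C = 2\pi gV\left(\frac{2m}{h^2}\right)^{3/2} \tag{28-4} $$

Applying (28-1) and (28-3) to (27-27) and (27-28), we obtain

$$ N = C\int_0^\infty \frac{\sqrt{\epsilon}\,d\epsilon}{e^{\alpha + \beta\epsilon} \mp 1} \tag{28-5} $$

$$ U = C\int_0^\infty \frac{\epsilon^{3/2}\,d\epsilon}{e^{\alpha + \beta\epsilon} \mp 1} \tag{28-6} $$

$$ \ln Q = \mp C\int_0^\infty \sqrt{\epsilon}\,\ln(1 \mp e^{-\alpha - \beta\epsilon})\,d\epsilon \tag{28-7} $$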
If we introduce the dimensionless variable $x$ by

$$ x = \beta\epsilon = \frac{\epsilon}{kT} \tag{28-8} $$

and let

$$ Z_B = \frac{V(2\pi mkT)^{3/2}}{h^3} \tag{28-9} $$

be the partition function for an ideal monatomic Boltzmann gas according
to (26-29), we can use (28-4) to write (28-7) in the form

$$ \ln Q = gZ_B\,\chi(\alpha) \tag{28-10} $$

where

$$ \chi(\alpha) = \mp\frac{2}{\sqrt{\pi}}\int_0^\infty \sqrt{x}\,\ln(1 \mp e^{-\alpha - x})\,dx \tag{28-11} $$
Using (27-30), (28-9), and (28-10), we find that

$$ U = \tfrac{3}{2}gkTZ_B\,\chi(\alpha) = \tfrac{3}{2}kT\ln Q \tag{28-12} $$

Similarly, we find from (27-35), (11-65), and (28-10) that

$$ p = \frac{kT}{V}\ln Q \tag{28-13} $$

so that we can also write

$$ \ln Q = \frac{pV}{kT} \tag{28-14} $$

which gives us a physical interpretation of the function $Q$. Combining
(28-12) and (28-14), we find

$$ U = \tfrac{3}{2}pV \tag{28-15} $$
The last result is independent of the kind of statistics involved and hence 
is a universal result for ideal gases; we have already found this to be 
valid for the classical case, as can be seen from (22-12) and (22-15). 

If we substitute (28-14) into (27-35) and use (11-4), the Gibbs function
$G$ is found to be

$$ G = F + pV = -NkT\alpha \tag{28-16} $$

and therefore

$$ \alpha = -\frac{G}{NkT} = -\beta\,\frac{G}{N} \tag{28-17} $$

so that $\alpha$ has the physical significance of being directly proportional to
the Gibbs function per particle. When we recall the result (26-75) for
thermal radiation, we see that $\alpha = 0$ in this case, and then (27-25) becomes

$$ n_i = \frac{1}{e^{\beta\epsilon_i} - 1} \tag{28-18} $$
If the thermal radiation is interpreted as being composed of “particles”
called photons, each of which has the energy $h\nu$, the total energy in each
state of a given frequency is

$$ n_i h\nu = \frac{h\nu}{e^{\beta h\nu} - 1} \tag{28-19} $$

Except for the zero point energy, (28-19) is the starting point (26-36) for
the derivation of Planck’s law (26-63) and is essentially the basis for
describing photons as Bose-Einstein particles. We also see by the derivation
of (27-24) from (27-22) that the significance of $\alpha = 0$ is that the
condition of constant number of particles is not applicable, the basic reason
being that the number of photons need not be conserved since the photons
can be changed in number and frequency by absorption, by emission, and
by scattering.


28-2 Nearly non-degenerate gases 
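Using (26-70) and (18-9), we find that

$$ \begin{aligned} \chi(\alpha) &= \frac{2}{\sqrt{\pi}}\sum_{n=1}^\infty \frac{(\pm 1)^{n+1}e^{-n\alpha}}{n}\int_0^\infty \sqrt{x}\,e^{-nx}\,dx \\ &= \sum_{n=1}^\infty \frac{(\pm 1)^{n+1}e^{-n\alpha}}{n^{5/2}} = e^{-\alpha}\left[1 \pm \frac{e^{-\alpha}}{2^{5/2}} + \cdots\right] \end{aligned} \tag{28-20} $$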
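The series (28-20) can be compared with a direct numerical evaluation of the integral (28-11). The following sketch is an illustration under stated assumptions: it uses Simpson's rule after the substitution $x = u^2$ (which removes the square-root behavior at the origin), truncates the integral at a finite upper limit, and requires $\alpha > 0$ in the Bose-Einstein case; the function names are hypothetical.

```python
import math

def chi_integral(alpha, sign, x_max=60.0, steps=20_000):
    """chi(alpha) of Eq. (28-11) by Simpson's rule, with x = u**2.

    sign = +1 for Bose-Einstein (needs alpha > 0), -1 for Fermi-Dirac.
    """
    u_max = math.sqrt(x_max)
    h = u_max / steps                     # steps is even, as Simpson's rule requires

    def integrand(u):
        # sqrt(x) ln(1 -/+ e^{-alpha-x}) dx becomes 2 u^2 ln(...) du
        return 2.0 * u * u * math.log(1.0 - sign * math.exp(-alpha - u * u))

    total = integrand(0.0) + integrand(u_max)
    for j in range(1, steps):
        total += (4.0 if j % 2 else 2.0) * integrand(j * h)
    return -sign * (2.0 / math.sqrt(math.pi)) * total * h / 3.0

def chi_series(alpha, sign, terms=200):
    """Series form of Eq. (28-20): sum of (+/-1)^(n+1) e^{-n alpha} / n^(5/2)."""
    return sum(sign ** (n + 1) * math.exp(-n * alpha) / n ** 2.5
               for n in range(1, terms + 1))

for sign, name in [(+1, "Bose-Einstein"), (-1, "Fermi-Dirac")]:
    print(name, chi_integral(1.0, sign), chi_series(1.0, sign))
```

For large $\alpha$ both forms reduce to $e^{-\alpha}$, which is the Boltzmann limit.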
The deviations from the Boltzmann limit will be calculated only to the
first correction term in order to illustrate the method and to determine the
qualitative differences between the two classes of particles. From (28-10)
and (28-20), we find

$$ \ln Q \approx gZ_Be^{-\alpha}\left(1 \pm \frac{e^{-\alpha}}{2^{5/2}}\right) \tag{28-21} $$

and therefore (27-32) yields

$$ N \approx gZ_B\left(e^{-\alpha} \pm \frac{2e^{-2\alpha}}{2^{5/2}}\right) = gZ_Be^{-\alpha}\left(1 \pm \frac{e^{-\alpha}}{2^{3/2}}\right) \tag{28-22} $$

so that

$$ e^{-\alpha} = \frac{N/gZ_B}{1 \pm (e^{-\alpha}/2^{3/2})} \approx \frac{N}{gZ_B}\left(1 \mp \frac{e^{-\alpha}}{2^{3/2}}\right) \tag{28-23} $$

However, $e^{-\alpha} = N/gZ_B$ to first order and this can be substituted for $e^{-\alpha}$
in the right side of (28-23); in this way the series expansion for $e^{-\alpha}$ which
is correct to second order terms is found to be

$$ e^{-\alpha} \approx \frac{N}{gZ_B}\left[1 \mp \frac{1}{2^{3/2}}\left(\frac{N}{gZ_B}\right)\right] \tag{28-24} $$

and can be used to eliminate $\alpha$ from other equations.
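For example, if we substitute (28-24) into (28-21) and keep only correction
terms linear in $N/gZ_B$, we find that

$$ \begin{aligned} \ln Q &\approx N\left[1 \mp \frac{1}{2^{3/2}}\left(\frac{N}{gZ_B}\right)\right]\left[1 \pm \frac{1}{2^{5/2}}\left(\frac{N}{gZ_B}\right)\right] + \cdots \\ &\approx N\left[1 \mp \frac{1}{2^{5/2}}\left(\frac{N}{gZ_B}\right)\right] \end{aligned} \tag{28-25} $$

which, when combined with (28-14) and (28-9), gives the equation of state:

$$ \frac{pV}{NkT} = 1 \mp \frac{1}{2^{5/2}}\left(\frac{N}{gZ_B}\right) = 1 \mp \frac{Nh^3}{2^{5/2}gV(2\pi mkT)^{3/2}} \tag{28-26} $$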


In essence, we have obtained the virial expansion of the equation of state 
since (28-26) is in the general form (23-6). Equation (28-26) shows that 
first order deviations from the ideal gas equation of state pV = NkT are 
equal and opposite for the Bose-Einstein and Fermi-Dirac ideal gases; 
this simple relation no longer holds when higher order correction terms 
are included. 
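To get a feeling for the size of the first-order term in (28-26), one can evaluate it for an ordinary gas. The sketch below assumes helium-4 with $g = 1$ at 273.15 K and one atmosphere with ideal-gas density; these conditions and the function name are assumptions chosen for illustration.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
K_B = 1.380649e-23        # Boltzmann constant, J/K
AMU = 1.66053906660e-27   # atomic mass unit, kg

def first_quantum_correction(number_density, mass, temperature, g=1, sign=+1):
    """Correction term of Eq. (28-26): -/+ (N / g Z_B) / 2^(5/2).

    sign = +1 (Bose-Einstein) lowers pV/NkT; sign = -1 (Fermi-Dirac) raises it.
    """
    n_over_gz = number_density * H**3 / (
        g * (2.0 * math.pi * mass * K_B * temperature) ** 1.5)
    return -sign * n_over_gz / 2.0 ** 2.5

T = 273.15                          # ASSUMED temperature, K
n = 101325.0 / (K_B * T)            # ASSUMED ideal-gas density at one atmosphere
correction = first_quantum_correction(n, 4.002602 * AMU, T, g=1, sign=+1)
print(f"pV/NkT - 1 ~ {correction:.2e}")
```

The correction is far below one part in a million under these conditions, consistent with the earlier conclusion that ordinary gases are non-degenerate.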

Similarly, we find the energy from (28-12) and (28-25) to be

$$ U = \tfrac{3}{2}NkT\left[1 \mp \frac{1}{2^{5/2}}\left(\frac{N}{gZ_B}\right)\right] = \tfrac{3}{2}Nk\left[T \mp \frac{Nh^3}{2^{5/2}gV(2\pi mk)^{3/2}T^{1/2}}\right] \tag{28-27} $$

The second form is useful for calculating the heat capacity:

$$ C_v = \left(\frac{\partial U}{\partial T}\right)_V = \tfrac{3}{2}Nk\left[1 \pm \frac{1}{2^{7/2}}\left(\frac{N}{gZ_B}\right)\right] \tag{28-28} $$

Both these expressions approach the Boltzmann values $\tfrac{3}{2}NkT$ and $\tfrac{3}{2}Nk$ in
the limits of high temperature and low density; the deviations from the
Boltzmann limits are again found to be equal and opposite for the two
cases. It is clear that we could go on and calculate the other thermodynamic
functions in the same fashion in order to find the complete description of
almost non-degenerate ideal gases, that is, gases which are given by the
Boltzmann limit as a first approximation.


28-3 Degenerate Bose-Einstein gas 


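We shall briefly consider the Bose-Einstein gas for the case in which it
is the most different from the Boltzmann limit. According to (27-36), this
degenerate state will correspond to the smallest possible value of $\alpha$. Since
$n_i$ cannot be negative, we see from (27-25) that $\alpha + \beta\epsilon_i > 0$. If the lowest
energy of the particle is $\epsilon_1$, then

$$ \alpha > -\beta\epsilon_1 \tag{28-29} $$

Therefore we are interested in the properties of the gas as $\alpha$ approaches
its limiting value $-\beta\epsilon_1$; hence we shall assume that

$$ \alpha + \beta\epsilon_1 \ll 1 $$

and also that

$$ \alpha + \beta\epsilon_1 \ll \beta(\epsilon_2 - \epsilon_1) $$

where $\epsilon_2$ is the next higher energy.
From (27-19) and (27-25), we therefore find that

$$ \begin{aligned} N = \sum_i n_i &= \frac{1}{e^{\alpha + \beta\epsilon_1} - 1} + \frac{1}{e^{\alpha + \beta\epsilon_2} - 1} + \cdots \\ &= \frac{1}{e^{\alpha + \beta\epsilon_1} - 1} + \frac{1}{e^{\alpha + \beta\epsilon_1 + \beta(\epsilon_2 - \epsilon_1)} - 1} + \cdots \\ &\approx \frac{1}{\alpha + \beta\epsilon_1} + \frac{1}{e^{\beta(\epsilon_2 - \epsilon_1)} - 1} + \cdots \\ &= \frac{1}{\alpha + \beta\epsilon_1}\left[1 + \frac{\alpha + \beta\epsilon_1}{e^{\beta(\epsilon_2 - \epsilon_1)} - 1} + \cdots\right] \approx \frac{1}{\alpha + \beta\epsilon_1} \end{aligned} $$

On the other hand,

$$ n_1 = \frac{1}{e^{\alpha + \beta\epsilon_1} - 1} \approx \frac{1}{\alpha + \beta\epsilon_1} \approx N $$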
Fig. 28-1 (heat capacity versus temperature $T$; the transition temperature $T_c$ is marked on the temperature axis)

In other words, in this limit of extreme degeneracy all the particles tend to 
occupy the lowest energy state, practically none being found in the higher 
energy states. This phenomenon is known as the Einstein condensation. 

If the effect is investigated in more detail, it is found, for example, that
the heat capacity is continuous but has a discontinuous slope at a transition
temperature $T_c$, much as is shown in Fig. 28-1; this behavior is sometimes
called a lambda transition because of the similarity of this figure to a
capital lambda. We can see from our classifications in Sec. 15-3 that such
a behavior would also be one of the expected characteristics of a third
order phase transition. The transformation of liquid helium to liquid
helium II, which occurs at 2.19 degrees absolute, is thought to be an
example of a Bose-Einstein condensation because the quantum mechanical
properties of He$^4$ nuclei require that they be described by Bose-Einstein
statistics.

The ordinary condensation of a vapor into a liquid is a condensation in
configuration space, because the particles tend to collect within limited
values of the coordinates as illustrated schematically in Fig. 28-2a. The
Einstein condensation, on the other hand, is a condensation in energy (or
momentum) space, with no particular restrictions on the coordinates other
than that the particles stay in the volume $V$; this effect is illustrated in
Fig. 28-2b.

Fig. 28-2 (panels (a) and (b); axis label: Density)


Exercises 


28-1. Obtain (28-3) by beginning with (26-2) and by using the method we 
used to find (25-34). Compare the general features of (28-3) with (18-20). 


28-2. Show that

$$ \frac{C_p}{C_v} = \frac{5}{3}\left[1 \pm \frac{1}{2^{5/2}}\left(\frac{N}{gZ_B}\right)\right] $$

[Hint: Recall (9-21).]
28-3. By comparing with the discussion of the second virial coefficient in 
Sec. 23-4, show that (28-26) can be interpreted as demonstrating that there is, 


in effect, an attraction between Bose-Einstein particles and a repulsion between 
Fermi-Dirac particles. 


29 Free electron theory of metals 


The most important application of Fermi-Dirac statistics is to systems of 
electrons, since they obey the Pauli exclusion principle. Drude first made 
the suggestion that electrons in a metal act like gas molecules and partici- 
pate in the thermal equilibrium and also are the carriers of the electric 
current. The first calculations of metallic properties based on this general 
classical model were fairly successful and thereby tended to increase the 
acceptance of the model; examples of these calculations are given in 
(I: Sec. 34-3). However, this model initially failed completely in predicting
the heat capacity of metals. If the electron were described by
classical Boltzmann statistics, each electron should contribute $\tfrac{3}{2}k$ to the
heat capacity according to the equipartition theorem result (25-22). If we
assume that there is about one free electron per metallic atom, the contribution
of the electrons to the molar heat capacity would be $c_{ve} = \tfrac{3}{2}R$.
If we added this to the Dulong-Petit value (25-28) for the lattice $c_{vl}$, we
would find the total molar heat capacity $c_v$ to be

$$ c_v = c_{vl} + c_{ve} = 3R + \tfrac{3}{2}R = \tfrac{9}{2}R \tag{29-1} $$

Metals, however, obey the Dulong-Petit law quite well and have a heat
capacity of about $3R$. Consequently, it was initially thought that for some
reason the electrons did not contribute to the heat capacity although they
were necessary to account for the electrical conductivity.

About 1928, Sommerfeld suggested that the electrons should actually 
be described by the newly developed Fermi-Dirac statistics rather than by 
the classical Boltzmann results. It was shown that calculations based on 
these ideas were able to resolve many of the early difficulties. 


29-1 General considerations 


We shall use a somewhat oversimplified picture, which, however, does 
lead to useful results which can be obtained with comparative ease. We 
assume initially that the electrons in the metal can be treated as an ideal gas 
in a box whose volume V is that of the metal and that they have a constant 
(zero) potential energy in the box and infinite potential energy at the walls. 
In doing this, we are assuming in effect that the interactions of the 
electrons with each other as well as their interactions with the metal ions 
which are spaced periodically throughout the volume can be regarded as 
averaging out to a uniform potential corresponding to no net force on an 
electron. 

If we use our previous formula (27-40) to estimate $e^{\alpha}$, we can see why
we might expect to have to treat the electrons as a degenerate Fermi-Dirac 
gas. If we assume about one free electron per atom, the density will be 
N/V = 10%, and we find that (27-40) becomes 


% 
(Se oa 


The condition for applicability of Boltzmann statistics is $e^{\alpha} \gg 1$; we see
from (29-2) that this corresponds to temperatures large compared to
about $10^5$ degrees absolute. Therefore, at ordinary temperatures, $e^{\alpha} \ll 1$
and the electrons are highly degenerate. To put this another way, we can 
say that at ordinary temperatures the electrons are comparatively at 
absolute zero; accordingly, most of their properties can be investigated 
by considering series expansions about their values at T = 0. 
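The order of magnitude of this crossover temperature follows directly from (27-40) by setting $e^{\alpha} = 1$. The sketch below is illustrative: the electron density of $10^{29}$ per cubic metre is an assumption (roughly one free electron per atom in a typical metal), and the function names are hypothetical.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
K_B = 1.380649e-23        # Boltzmann constant, J/K
M_E = 9.1093837015e-31    # electron mass, kg

def exp_alpha(number_density, mass, temperature):
    """Eq. (27-40): e^alpha ~ (2 pi m k T)^(3/2) / (n h^3)."""
    return (2.0 * math.pi * mass * K_B * temperature) ** 1.5 / (number_density * H**3)

def crossover_temperature(number_density, mass):
    """Temperature at which Eq. (27-40) gives e^alpha = 1."""
    return H**2 * number_density ** (2.0 / 3.0) / (2.0 * math.pi * mass * K_B)

n_e = 1.0e29   # ASSUMED conduction-electron density, m^-3
T_star = crossover_temperature(n_e, M_E)
print(f"e^alpha = 1 near T ~ {T_star:.2e} K")
print(f"e^alpha at 300 K ~ {exp_alpha(n_e, M_E, 300.0):.1e}")
```

Under this assumed density the crossover lies near $10^5$ degrees, so that at room temperature $e^{\alpha}$ is several orders of magnitude below unity.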

We also need to take into account the degeneracy factor g which arises 
from the quantum mechanical properties of electrons. Just as transverse 
electromagnetic waves can have two states of polarization, electrons of a 
given translational state can have two possible values of a coordinate 
describing an intrinsic angular momentum or “spin.’’” We need not go 
into the precise meaning of electron spin here, but simply accept the fact 
that we must take g = 2; thus the number of states in a given energy 
range is doubled, and therefore instead of (28-4) we must now use

$$ C = 4\pi V\left(\frac{2m}{h^2}\right)^{3/2} \tag{29-3} $$


in formulas such as (28-3) through (28-7). 
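It is also convenient to define the Fermi energy $\zeta$ or Fermi level by

$$ \alpha = -\beta\zeta \tag{29-4} $$

We see from (28-17) that $\zeta$ is the Gibbs function per particle. Then, for
example, we can write (28-5) as

$$ N = C\int_0^\infty \frac{\sqrt{\epsilon}\,d\epsilon}{e^{\beta(\epsilon - \zeta)} + 1} = \int_0^\infty n(\epsilon)\,d\epsilon \tag{29-5} $$

where $n(\epsilon)\,d\epsilon$ is the number of particles with energies between $\epsilon$ and
$\epsilon + d\epsilon$. Using (28-3) and (29-5), we can put this number in the form

$$ n(\epsilon)\,d\epsilon = C\sqrt{\epsilon}\,f(\epsilon)\,d\epsilon = c(\epsilon)f(\epsilon)\,d\epsilon \tag{29-6} $$

where

$$ f(\epsilon) = \frac{1}{e^{\beta(\epsilon - \zeta)} + 1} \tag{29-7} $$

can be interpreted as the fraction of the total number of possible states
$c(\epsilon)\,d\epsilon$ which are actually occupied by electrons, or as the probability that
the state of energy $\epsilon$ will be occupied. In principle, (29-5) determines the
value of the Fermi energy.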


29-2 Properties at absolute zero 


We let $\zeta_0$ be the value of the Fermi energy at absolute zero. As $T \to 0$,
$\beta \to \infty$ and then

$$ \begin{aligned} &\text{if } \epsilon < \zeta_0, \quad e^{\beta(\epsilon - \zeta_0)} \to 0, \quad f \to 1 \\ &\text{if } \epsilon > \zeta_0, \quad e^{\beta(\epsilon - \zeta_0)} \to \infty, \quad f \to 0 \end{aligned} \tag{29-8} $$

according to (29-7). This dependence of $f(\epsilon)$ on $\epsilon$ is shown in Fig. 29-1.
We see that there is a discontinuity in the probability of occupation of the
states: all the states are completely occupied up to the limiting Fermi
energy $\zeta_0$ and none of the states is occupied for greater energy. The
system of electrons as a whole is in its lowest possible energy state, but
because of the exclusion principle there are individual electrons in particle
states of finite energy even at $T = 0$.
We can determine $\zeta_0$ from (29-5) and (29-8) as follows:

$$ N = C\int_0^{\zeta_0} \sqrt{\epsilon}\,d\epsilon = \tfrac{2}{3}C\zeta_0^{3/2} \tag{29-9} $$

Therefore, if we also use (29-3), we find that

$$ \zeta_0 = \left(\frac{3N}{2C}\right)^{2/3} = \frac{h^2}{2m}\left(\frac{3N}{8\pi V}\right)^{2/3} \tag{29-10} $$

which shows that $\zeta_0$ depends only on the density of electrons.
Similarly, the zero point energy $U_0$ can be found from (28-6), (29-8), and
(29-9) to be

$$ U_0 = C\int_0^{\zeta_0} \epsilon^{3/2}\,d\epsilon = \tfrac{2}{5}C\zeta_0^{5/2} = \tfrac{3}{5}N\zeta_0 \tag{29-11} $$

so that the average energy per electron is $\tfrac{3}{5}\zeta_0$. The zero point pressure
$p_0$ can be obtained from (28-15) and (29-11); the result is

$$ p_0 = \frac{2}{5}\left(\frac{N}{V}\right)\zeta_0 = \frac{8\pi h^2}{15m}\left(\frac{3N}{8\pi V}\right)^{5/3} \tag{29-12} $$

Fig. 29-1 ($f(\epsilon)$ versus $\epsilon$; $\zeta_0$ marked on the energy axis)

These quantities are not negligible. For example, if (29-12) is evaluated for
copper, it is found that $p_0 \approx 4 \times 10^5$ atmospheres. This large zero point
pressure corresponds to the electrical attraction between the electrons
and metallic ions since this is what keeps the electrons within the metal in
spite of the large value of $p_0$.
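The numbers for copper can be reproduced from (29-10) and (29-12). This is a minimal sketch, assuming a conduction-electron density of $8.5 \times 10^{28}$ per cubic metre for copper (about one free electron per atom); that density and the function names are assumptions, not values taken from the text.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
K_B = 1.380649e-23        # Boltzmann constant, J/K
M_E = 9.1093837015e-31    # electron mass, kg
EV = 1.602176634e-19      # joules per electron volt
ATM = 101325.0            # pascals per atmosphere

def fermi_energy_zero(number_density):
    """Eq. (29-10): zeta_0 = (h^2 / 2m) (3n / 8 pi)^(2/3), in joules."""
    return H**2 / (2.0 * M_E) * (3.0 * number_density / (8.0 * math.pi)) ** (2.0 / 3.0)

def zero_point_pressure(number_density):
    """Eq. (29-12): p_0 = (2/5) n zeta_0, in pascals."""
    return 0.4 * number_density * fermi_energy_zero(number_density)

n_cu = 8.5e28   # ASSUMED conduction-electron density of copper, m^-3
zeta0 = fermi_energy_zero(n_cu)
print(f"zeta_0 ~ {zeta0 / EV:.2f} eV, zeta_0 / k ~ {zeta0 / K_B:.2e} K")
print(f"U_0 / N ~ {0.6 * zeta0 / EV:.2f} eV per electron")
print(f"p_0 ~ {zero_point_pressure(n_cu) / ATM:.2e} atm")
```

With this assumed density the pressure comes out at a few times $10^5$ atmospheres, in line with the figure quoted above.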


29-3 Properties for finite temperatures 


We shall evaluate the quantities of interest by approximate methods
which will be valid as long as $kT \ll \zeta_0$, that is,

$$ \beta\zeta_0 \gg 1 \tag{29-13} $$

If we use (29-10), we see that (29-13) is equivalent to

$$ T \ll \frac{h^2}{2mk}\left(\frac{3N}{8\pi V}\right)^{2/3} \approx 10^5 \text{ degrees} \tag{29-14} $$

which will certainly be valid for any situation we shall want to consider.
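The equation which determines $\zeta$ is (29-5), which we can also write as

$$ N = C\int_0^\infty f(\epsilon)\sqrt{\epsilon}\,d\epsilon \tag{29-15} $$

If we integrate (29-15) by parts, we obtain

$$ \begin{aligned} N &= \tfrac{2}{3}C\left[\epsilon^{3/2}f\right]_0^\infty - C\int_0^\infty \tfrac{2}{3}\epsilon^{3/2}\frac{df}{d\epsilon}\,d\epsilon \\ &= -C\int_0^\infty \tfrac{2}{3}\epsilon^{3/2}\frac{df}{d\epsilon}\,d\epsilon \end{aligned} \tag{29-16} $$

with the use of (29-7). Many quantities of interest in this field can be
written in the form

$$ I = \int_0^\infty \frac{dF}{d\epsilon}\,f\,d\epsilon = -\int_0^\infty F(\epsilon)\frac{df}{d\epsilon}\,d\epsilon \tag{29-17} $$

where $F(\epsilon)$ vanishes when $\epsilon = 0$; this result enables us to devise a useful
series expansion for $I$.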

The general appearance of $f(\epsilon)$ for $T \neq 0$ is shown in Fig. 29-2a. We
see that $f$ is either almost 1 or zero except for a small region about $\zeta$;
consequently, the derivative of $f$ is almost zero everywhere except near $\zeta$, as
shown in Fig. 29-2b. Therefore it is appropriate to expand $F(\epsilon)$ in a series
about the Fermi energy.

Fig. 29-2 (panel (a): $f$; panel (b): $-df/d\epsilon$)

We introduce the dimensionless variable $x = \beta(\epsilon - \zeta)$ so that

$$ f = \frac{1}{e^x + 1}, \qquad \frac{df}{d\epsilon} = -\frac{\beta e^x}{(e^x + 1)^2} \tag{29-18} $$

We also find that (29-17) becomes

$$ I = \int_{-\beta\zeta}^\infty F\left(\zeta + \frac{x}{\beta}\right)\frac{e^x\,dx}{(e^x + 1)^2} \tag{29-19} $$
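Since we expect $\zeta$ to be not very different from $\zeta_0$, then $\beta\zeta \gg 1$ because of
(29-13) and we can replace the lower limit of integration in (29-19) by $-\infty$.
If we expand $F$ in a power series about $\zeta$, we obtain

$$ F\left(\zeta + \frac{x}{\beta}\right) = F(\zeta) + \frac{x}{\beta}\left(\frac{dF}{d\epsilon}\right)_\zeta + \frac{x^2}{2\beta^2}\left(\frac{d^2F}{d\epsilon^2}\right)_\zeta + \cdots \tag{29-20} $$

and, when (29-20) is substituted into (29-19), we obtain

$$ I = F(\zeta)I_0 + \frac{1}{\beta}\left(\frac{dF}{d\epsilon}\right)_\zeta I_1 + \frac{1}{2\beta^2}\left(\frac{d^2F}{d\epsilon^2}\right)_\zeta I_2 + \cdots \tag{29-21} $$

where

$$ I_0 = \int_{-\infty}^\infty \frac{e^x\,dx}{(e^x + 1)^2} = 1 \tag{29-22} $$

$$ I_1 = \int_{-\infty}^\infty \frac{xe^x\,dx}{(e^x + 1)^2} = 0 \tag{29-23} $$

because

$$ \frac{e^x}{(e^x + 1)^2} = \frac{e^{-x}}{(e^{-x} + 1)^2} $$

making the integrand of $I_1$ an odd function of $x$. Also,

$$ \begin{aligned} I_2 &= \int_{-\infty}^\infty \frac{x^2e^x\,dx}{(e^x + 1)^2} = 2\int_0^\infty \frac{x^2e^{-x}\,dx}{(1 + e^{-x})^2} \\ &= 2\int_0^\infty x^2e^{-x}(1 - 2e^{-x} + 3e^{-2x} - \cdots)\,dx \\ &= 4\left(1 - \frac{1}{2^2} + \frac{1}{3^2} - \cdots\right) = 2\zeta(2) = \frac{\pi^2}{3} \end{aligned} \tag{29-24} $$

with the help of (26-57) and the definition of the zeta function. Substituting
these results into (29-21), we obtain

$$ I = -\int_0^\infty F(\epsilon)\frac{df}{d\epsilon}\,d\epsilon = F(\zeta) + \frac{\pi^2}{6}(kT)^2\left(\frac{d^2F}{d\epsilon^2}\right)_\zeta + \cdots \tag{29-25} $$

as our desired approximation.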
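The three integrals (29-22) to (29-24) can be confirmed by direct quadrature. The sketch below is an illustration with a hypothetical function name; it uses Simpson's rule on a finite symmetric interval and writes the weight as $e^{-|x|}/(1 + e^{-|x|})^2$, which equals $e^x/(e^x + 1)^2$ and avoids overflow at large $|x|$.

```python
import math

def sommerfeld_moment(power, x_max=40.0, steps=40_000):
    """Simpson estimate of the integral of x^power e^x / (e^x + 1)^2 over (-x_max, x_max)."""
    h = 2.0 * x_max / steps            # steps is even, as Simpson's rule requires

    def integrand(x):
        t = math.exp(-abs(x))          # the weight is an even function of x
        return x**power * t / (1.0 + t) ** 2

    total = integrand(-x_max) + integrand(x_max)
    for j in range(1, steps):
        total += (4.0 if j % 2 else 2.0) * integrand(-x_max + j * h)
    return total * h / 3.0

I0, I1, I2 = (sommerfeld_moment(p) for p in (0, 1, 2))
print(I0, I1, I2, math.pi**2 / 3.0)
```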
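Applying this result to (29-16), we see from (29-17) that, when $I = N$,

$$ F(\epsilon) = \tfrac{2}{3}C\epsilon^{3/2}, \qquad \frac{d^2F}{d\epsilon^2} = \frac{C}{2\sqrt{\epsilon}} $$

If we use these results in (29-25) and also use (29-9), we obtain

$$ N = \tfrac{2}{3}C\zeta_0^{3/2} = \tfrac{2}{3}C\zeta^{3/2} + \frac{\pi^2}{6}(kT)^2\frac{C}{2\sqrt{\zeta}} + \cdots $$

so that

$$ \zeta_0^{3/2} = \zeta^{3/2}\left[1 + \frac{\pi^2}{8}\left(\frac{kT}{\zeta}\right)^2 + \cdots\right] $$

and therefore

$$ \zeta = \zeta_0\left[1 + \frac{\pi^2}{8}\left(\frac{kT}{\zeta}\right)^2\right]^{-2/3} \approx \zeta_0\left[1 - \frac{\pi^2}{12}\left(\frac{kT}{\zeta_0}\right)^2\right] \tag{29-26} $$

is our expression for the Fermi energy as a function of temperature.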
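The approximation (29-26) can be tested by solving (29-15) for $\zeta$ numerically. The sketch below is illustrative and rests on several assumptions: energies are measured in units of $\zeta_0$, so that the condition becomes an integral equal to $2/3$ by (29-9); the integral is cut off where the occupation falls below about $e^{-40}$; and the bisection bracket is suitable only for $kT/\zeta_0$ well below unity. The function names are hypothetical.

```python
import math

def particle_integral(z, t, steps=4_000):
    """Integral of sqrt(e) / (exp((e - z)/t) + 1) de from 0 to infinity.

    Energies are in units of zeta_0 and t = kT / zeta_0.  The substitution
    e = u**2 removes the square-root singularity at e = 0.
    """
    u_max = math.sqrt(z + 40.0 * t)    # occupation beyond this point is negligible
    h = u_max / steps                  # steps is even, as Simpson's rule requires

    def integrand(u):
        return 2.0 * u * u / (math.exp((u * u - z) / t) + 1.0)

    total = integrand(0.0) + integrand(u_max)
    for j in range(1, steps):
        total += (4.0 if j % 2 else 2.0) * integrand(j * h)
    return total * h / 3.0

def fermi_level(t, tol=1e-10):
    """Solve particle_integral(z, t) = 2/3 for z = zeta / zeta_0 by bisection."""
    lo, hi = 0.5, 1.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if particle_integral(mid, t) < 2.0 / 3.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

t = 0.05
z_numeric = fermi_level(t)
z_series = 1.0 - math.pi**2 / 12.0 * t**2      # Eq. (29-26)
print(z_numeric, z_series)
```

The two values differ only at the level of the neglected higher-order terms, and the numerical root lies slightly below $\zeta_0$ as (29-26) predicts.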
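In order to calculate $U$, we compare (28-6) and (29-17) and we see that,
when $I = U$,

$$ \frac{dF}{d\epsilon} = C\epsilon^{3/2}, \qquad F = \tfrac{2}{5}C\epsilon^{5/2}, \qquad \frac{d^2F}{d\epsilon^2} = \tfrac{3}{2}C\epsilon^{1/2} $$

and (29-25) becomes

$$ U = \tfrac{2}{5}C\zeta^{5/2}\left[1 + \frac{5\pi^2}{8}\left(\frac{kT}{\zeta}\right)^2 + \cdots\right] \tag{29-27} $$

We find that $C = 5U_0/2\zeta_0^{5/2}$ from (29-11), and this may be used in (29-27). In
the bracket of (29-27), $\zeta$ can be replaced by $\zeta_0$ because this bracket already
involves a second order term. If we use (29-26) in addition and keep only
second order terms in $kT/\zeta_0$, so that $(1 + x)^n \approx 1 + nx$, we find that
(29-27) becomes

$$ \begin{aligned} U &= U_0\left(\frac{\zeta}{\zeta_0}\right)^{5/2}\left[1 + \frac{5\pi^2}{8}\left(\frac{kT}{\zeta_0}\right)^2\right] \\ &= U_0\left[1 - \frac{5\pi^2}{24}\left(\frac{kT}{\zeta_0}\right)^2\right]\left[1 + \frac{5\pi^2}{8}\left(\frac{kT}{\zeta_0}\right)^2\right] \\ &= U_0\left[1 + \frac{5\pi^2}{12}\left(\frac{kT}{\zeta_0}\right)^2 + \cdots\right] \end{aligned} \tag{29-28} $$

The heat capacity of the system of electrons then is found to be

$$ C_v = \frac{dU}{dT} = \frac{\pi^2}{2}Nk\left(\frac{kT}{\zeta_0}\right) \sim T \tag{29-29} $$

with the use of (29-11). The ratio of the molar heat capacity to the
Dulong-Petit value of $3R$ is therefore

$$ \frac{c_{v,\text{elec}}}{c_{v,\text{D-P}}} = \frac{\pi^2}{6}\left(\frac{kT}{\zeta_0}\right) \tag{29-30} $$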
and, since $kT/\zeta_0 \ll 1$ by (29-13), we see that ordinarily the electrons make
only a negligible contribution to the heat capacity of a metal. We have
now satisfied the earlier objection to the electron theory of metals for
which the classical equipartition theorem predicted too large a contribution
from the electrons.

The situation is quite different, however, at extremely low temperatures
where the heat capacity of the lattice is proportional to $T^3$ as given by the
Debye theory result (26-60b). Since the electron contribution is linearly
proportional to $T$ according to (29-29), we see that the electrons will be
the source of practically all the heat capacity once the temperature is low
enough to make the term linear in $T$ dominant over the $T^3$ term. This is
the general behavior found for the heat capacity of metals; at low enough
temperatures, $c_v \sim T$ as previously mentioned near the end of Sec. 26-5.

We can also calculate the low-temperature entropy of the electrons from
(29-29) in order to show that the third law of thermodynamics (13-1) is
now satisfied:

$$ S = \int_0^T \frac{C_v\,dT}{T} = \frac{\pi^2}{2}Nk\left(\frac{kT}{\zeta_0}\right) + \cdots \tag{29-31} $$


The results we have obtained for a finite temperature can be understood 
and interpreted more easily in terms of our next calculation, in which we 
wish to find the fraction of the electrons at T # 0 which have been excited 
above the Fermi energy 5. If we let dN be the number whose energy is 
greater than ¢y, we find from (29-5) and (29-9) that the fraction of interest 
is 


a) oe ra) 16 
ON _C[*_Vvede_ _ 3 i) (2 + Boy" de O59. 39) 
B 


N Node eheO4 4 2(BLo)* Jett) e* + 1 


We are going to determine only the lowest order approximation to (29-32) 
which we can obtain by replacing the lower limit of integration by zero 
because of (29-26) and by taking (x + Bl)* ~ (80)% = (BC,)%; we 
therefore find that 
Lee | ee (=) In2=104%2 — (29.33) 
N — 2Boo C 


0 


oe +1 2 


bo 


Therefore the fraction $kT/\zeta_0$ very nearly equals the fraction of the electrons
which have been thermally excited above the absolute zero state, so that,
in effect, only this small number of the total electrons are responsible for
the properties of the system. For example, (29-29) is approximately equal
to the classical heat capacity of a system of $N(kT/\zeta_0)$ effective electrons
rather than of the number $N$ actually present.
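As a numerical check on (29-33), the following Python sketch evaluates the integral in (29-32) with a simple midpoint rule and compares it with the lowest order result. The function names, the step count, and the cutoff `x_max` are illustrative choices made here, not part of the text; the value of $\zeta$ is taken from (29-26) in the assumed form $\zeta \approx \zeta_0[1 - (\pi^2/12)(kT/\zeta_0)^2]$.

```python
import math


def excited_fraction(kT_over_zeta0, steps=20000, x_max=40.0):
    """Midpoint-rule estimate of delta N / N from (29-32).

    Assumes zeta = zeta0 * (1 - (pi^2 / 12) * t^2) with t = kT / zeta0.
    """
    t = kT_over_zeta0
    beta_zeta0 = 1.0 / t
    beta_zeta = beta_zeta0 * (1.0 - (math.pi ** 2 / 12.0) * t * t)
    lower = beta_zeta0 - beta_zeta  # beta * (zeta0 - zeta), small and positive
    h = (x_max - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * h  # midpoint of the i-th subinterval
        total += math.sqrt(x + beta_zeta) / (math.exp(x) + 1.0)
    return 1.5 * beta_zeta0 ** -1.5 * total * h


def lowest_order_fraction(kT_over_zeta0):
    """Lowest order result (29-33): (3/2) ln 2 * kT / zeta0."""
    return 1.5 * math.log(2.0) * kT_over_zeta0


if __name__ == "__main__":
    t = 0.01  # illustrative value of kT / zeta0
    print(excited_fraction(t), lowest_order_fraction(t))
```

For small values of $kT/\zeta_0$ the two numbers should agree closely, which is the sense in which only a fraction of order $kT/\zeta_0$ of the electrons is thermally excited.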
29-4 Thermionic emission


When a metal is heated, it is found that some of the electrons are able
to escape through the surface; the resulting current is called the thermionic
emission current. We have already mentioned that our assumption that
the electrons move about in the interior as if there were no net force
acting on them is equivalent to an assumption of constant potential energy
in the interior, which we have taken to be zero. At the surface of the
metal, strong attractive forces balance the electron pressure, so that the
potential energy must increase very rapidly and then become constant
again outside the metal where the electron can again be regarded as free.
The model we shall accordingly adopt is that represented by the potential
energy curve of Fig. 29-3. Therefore $E_0$ is the total difference in energy
between the lowest state of an electron inside and one outside. At $T = 0$
the energy states are occupied up to $\zeta_0$ as indicated by the shading; the
difference

$$\phi = E_0 - \zeta_0 \qquad (29\text{-}34)$$

is called the work function of the metal.

[Fig. 29-3: potential energy curve, with the regions labeled Inside and Outside]

When the electron energy becomes sufficiently large, the electrons are
able to escape and we wish to calculate the resultant current density. The
energy is $\epsilon = p^2/2m$; the velocity component perpendicular to the wall is
$p_z/m$. Since the current density of the electrons in the $i$th state is the
product of the charge density $\rho_i$ and the velocity according to (6-2), the
component of the total current density perpendicular to the wall is

$$J_z = \sum_i \rho_i u_{zi} = \sum_i \left(\frac{n_i e}{V}\right)\left(\frac{p_z}{m}\right)_i \qquad (29\text{-}35)$$

since $n_i$ is the total number of particles in the $i$th state and $e$ is the electron
charge. We can convert (29-35) to an integral by multiplying it by


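$$\frac{2\,d\Omega_\mu}{h^3} = \frac{2\,dp_x\,dp_y\,dp_z\,dx\,dy\,dz}{h^3} \qquad (29\text{-}36)$$

The factor 2 arises because of the spin degeneracy; hence there are two
total states for each cell in the $\mu$-space. If we substitute (27-26), (29-4),
and (29-36) into (29-35), integrate $x$, $y$, and $z$ over the total volume $V$,
and note that an electron cannot escape unless it has a velocity component
in the $z$ direction and a kinetic energy at least equal to $E_0$, so that
$p_z^2/2m \geq E_0$, we obtain

$$J_z = \frac{2e}{mh^3}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} dp_x\,dp_y \int_{(2mE_0)^{1/2}}^{\infty} \frac{p_z\,dp_z}{e^{\beta(p^2/2m-\zeta)}+1} \qquad (29\text{-}37)$$

We can put this into a more convenient form by noting that

$$\frac{\partial\epsilon}{\partial p_z} = \frac{\partial}{\partial p_z}\left(\frac{p^2}{2m}\right) = \frac{p_z}{m}$$

which enables us to write (29-37) as

$$J_z = \frac{2e}{h^3}\int_{-\infty}^{\infty} dp_x \int_{-\infty}^{\infty} dp_y \int_{(2mE_0)^{1/2}}^{\infty} \frac{(\partial\epsilon/\partial p_z)\,dp_z}{e^{\beta(p^2/2m-\zeta)}+1}$$

$$\phantom{J_z} = \frac{2e}{h^3}\int_{-\infty}^{\infty} dp_x \int_{-\infty}^{\infty} dp_y \int_{E_0+(p_x^2+p_y^2)/2m}^{\infty} \frac{d\epsilon}{e^{\beta(\epsilon-\zeta)}+1} \qquad (29\text{-}38)$$

The integral over $\epsilon$ in (29-38) becomes

$$\int_{E_0+(p_x^2+p_y^2)/2m}^{\infty} \frac{e^{-\beta(\epsilon-\zeta)}\,d\epsilon}{1+e^{-\beta(\epsilon-\zeta)}} = \frac{1}{\beta}\ln\left[1 + e^{-\beta(E_0-\zeta)}\,e^{-\beta(p_x^2+p_y^2)/2m}\right] \qquad (29\text{-}39)$$

At ordinary temperatures,

$$\beta(E_0-\zeta) \approx \beta(E_0-\zeta_0) = \beta\phi \gg 1 \qquad (29\text{-}40)$$

so that $e^{-\beta\phi} \ll 1$ and we can use the approximation $\ln(1+x) \approx x$ for
$x \ll 1$ in (29-39); when this is done, (29-38) becomes

$$J_z \approx \frac{2e}{\beta h^3}\,e^{-\beta\phi}\int_{-\infty}^{\infty} e^{-\beta p_x^2/2m}\,dp_x \int_{-\infty}^{\infty} e^{-\beta p_y^2/2m}\,dp_y$$

so that, if we use (18-9) and (21-60), we finally obtain

$$J_z = \frac{4\pi m e k^2}{h^3}\,T^2 e^{-\phi/kT} \qquad (29\text{-}41)$$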


which is known as the Richardson equation. 
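To give a feeling for the magnitudes in (29-41), the sketch below evaluates the constant $4\pi m e k^2/h^3$ and the resulting current density in SI units. The constants are the CODATA values; the function names and the work function of 4.5 electron volts used in the usage lines are illustrative assumptions, not values taken from the text.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
H = 6.62607015e-34          # Planck constant, J s
E_CHARGE = 1.602176634e-19  # magnitude of the electron charge, C
M_E = 9.1093837015e-31      # free electron mass, kg


def richardson_constant():
    """Prefactor 4 pi m e k^2 / h^3 of (29-41), in A m^-2 K^-2."""
    return 4.0 * math.pi * M_E * E_CHARGE * K_B ** 2 / H ** 3


def richardson_current_density(work_function_eV, T):
    """Current density J_z of (29-41) in A/m^2 for a work function in eV."""
    phi = work_function_eV * E_CHARGE  # convert eV to joules
    return richardson_constant() * T ** 2 * math.exp(-phi / (K_B * T))


if __name__ == "__main__":
    print(richardson_constant())
    # Illustrative work function of 4.5 eV (an assumption for this example).
    for T in (1500.0, 2000.0, 2500.0):
        print(T, richardson_current_density(4.5, T))
```

Because $\phi \gg kT$, the exponential dominates the temperature dependence, so the computed current density rises very steeply with $T$.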
It has been shown experimentally that the velocity distribution of the
electrons which have left the metal is approximately a Maxwell distribution.
This may seem surprising at first because the electrons inside have
a Fermi-Dirac distribution; however, the observed distribution can be
obtained quite easily from our model. The number of electrons in the
energy range $d\epsilon$ at $\epsilon$ is given by (29-6). This equation applies also to those
electrons which leave the metal and for which $\epsilon > E_0$. If we let $\epsilon' = \epsilon - E_0$
be the kinetic energy of the electrons outside the metal, their distribution
as obtained from (29-6) is

$$n'(\epsilon')\,d\epsilon' = \frac{C(\epsilon'+E_0)^{1/2}\,d\epsilon'}{e^{\beta(\epsilon'+E_0-\zeta)}+1} \approx \frac{C(\epsilon'+E_0)^{1/2}\,d\epsilon'}{e^{\beta(\epsilon'+\phi)}+1} \qquad (29\text{-}42)$$

which can be approximated with the use of (29-40) and $\epsilon' \ll E_0$ as

$$n'(\epsilon')\,d\epsilon' \approx C\sqrt{E_0}\,e^{-\beta\phi}\,e^{-\beta\epsilon'}\,d\epsilon' = C' e^{-\beta\epsilon'}\,d\epsilon' \qquad (29\text{-}43)$$

where $C'$ is a new constant for a given temperature. Equation (29-43),
giving the distribution of the energies $\epsilon'$ observed outside the metal, is
exactly the classical distribution function.


29-5 Thermal and electrical conductivities 


We shall discuss these non-equilibrium phenomena for a metal in a brief 
and simplified manner which, however, will yield results which differ only 
by small numerical factors from those obtained from a more exact theory. 

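In our discussion of transport phenomena in a gas in Chapter 19, we
found a formula for the heat flux $Q$ given by the first expression of (19-17).
If we use (25-2) and (17-12), we can write

$$\rho\left(\frac{\partial\bar\epsilon}{\partial y}\right)_0 = \frac{N}{V}\,\frac{\partial\bar\epsilon}{\partial T}\left(\frac{\partial T}{\partial y}\right)_0 = \frac{1}{V}\,\frac{\partial U}{\partial T}\left(\frac{\partial T}{\partial y}\right)_0 = \frac{C_v}{V}\left(\frac{\partial T}{\partial y}\right)_0$$

which, when substituted into (19-17) and compared with (19-18), gives us
the more general expression for the thermal conductivity $K$:

$$K = \frac{C_v \bar u l}{3V} \qquad (29\text{-}44)$$

Using $\epsilon = \tfrac{1}{2}mu^2$, we can find $\bar u$ from (29-6) as

$$\bar u = \frac{1}{N}\int_0^{\infty} \left(\frac{2\epsilon}{m}\right)^{1/2} n(\epsilon)\,d\epsilon = \frac{C}{N}\left(\frac{2}{m}\right)^{1/2}\int_0^{\infty} \frac{\epsilon\,d\epsilon}{e^{\beta(\epsilon-\zeta)}+1} \qquad (29\text{-}45)$$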
Since the average speed $\bar u$ is very large even when $T = 0$ and changes very
little with temperature, it will be sufficient for our purposes to evaluate $\bar u$
at $T = 0$. By using (29-8) and (29-9), we find that

$$\bar u_0 = \frac{3}{2\zeta_0^{3/2}}\left(\frac{2}{m}\right)^{1/2}\int_0^{\zeta_0}\epsilon\,d\epsilon = \frac{3}{4}\left(\frac{2\zeta_0}{m}\right)^{1/2} = \frac{3}{4}u_0 \qquad (29\text{-}46)$$

where $u_0$ is the speed of an electron with the Fermi energy. Using (29-29),
(29-44), and (29-46), we obtain

$$K = \frac{\pi^2 N l k^2 T}{4Vmu_0} \qquad (29\text{-}47)$$


Before discussing (29-47), let us turn to the calculation of the electrical
conductivity. In the absence of an electric field, the electrons have random
directions of motion so that the average $z$ component of the velocity
vanishes according to (16-14'). Since $\bar u_z = 0$, there will be no average
current. If there is now a field $E$ in the $z$ direction, each electron of charge
$e$ has an acceleration

$$a_z = \frac{eE}{m} \qquad (29\text{-}48)$$

by (5-1) and (6-46). If we assume that the average effect of a collision is
to give the electrons a random direction of motion, and if $\tau$ is the average
time between collisions, we see from (29-48) that by the time the next
collision occurs an average electron will have acquired a velocity component
in the direction of the field given by $u_z = a_z\tau$. During the time
between collisions, its average velocity along $E$ is therefore

$$\bar u_z = \tfrac{1}{2}a_z\tau = \frac{eE\tau}{2m} \qquad (29\text{-}49)$$


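The average current density $J_z$ in the direction of $E$ is then obtained
from (6-2) as

$$J_z = \rho\bar u_z = \left(\frac{Ne}{V}\right)\left(\frac{eE\tau}{2m}\right) = \left(\frac{Ne^2\tau}{2Vm}\right)E = \sigma E \qquad (29\text{-}50)$$

where $\sigma$ is the electrical conductivity. Therefore we have

$$\sigma = \frac{Ne^2\tau}{2Vm} = \frac{Ne^2 l}{2Vm\bar u} = \frac{2Ne^2 l}{3Vmu_0} \qquad (29\text{-}51)$$

if we take $\tau \approx l/\bar u$ and use (29-46).

Both $K$ and $\sigma$ depend on the ratio $l/u_0$. If we calculate the ratio of the
conductivities, we obtain

$$\frac{K}{\sigma} = \frac{3\pi^2}{8}\left(\frac{k}{e}\right)^2 T \qquad (29\text{-}52)$$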
which shows that this ratio should be proportional to the temperature
but otherwise independent of the metal; the general nature of this result is
reminiscent of (19-20). Equation (29-52) is known as the Wiedemann-Franz
law; it was obtained empirically by Wiedemann and Franz in 1853, long before the
development of the electron theory of metals. Experimentally it is found
that the ratio $K/\sigma$ is constant to within about 15 per cent for different
metals at the same temperature. The fact that the electron theory of 
metals predicts this result (29-52) is additional strong evidence of its 
basic validity. 
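The cancellation of $N/V$, $l$, and $u_0$ in the ratio can be checked numerically. The sketch below codes (29-47) and the last form of (29-51) directly and compares $K/\sigma T$ with the coefficient $(3\pi^2/8)(k/e)^2$ of (29-52). The function names and the two sets of material parameters are hypothetical values chosen only for illustration. For comparison, the sketch also evaluates $(\pi^2/3)(k/e)^2$, the coefficient given by the more exact Sommerfeld treatment; that value is not derived in this section and is quoted here only as a reference point.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # magnitude of the electron charge, C
M_E = 9.1093837015e-31      # free electron mass, kg


def thermal_conductivity(n_density, mean_free_path, u0, T):
    """K from (29-47), with n_density = N/V."""
    return (math.pi ** 2 * n_density * mean_free_path * K_B ** 2 * T
            / (4.0 * M_E * u0))


def electrical_conductivity(n_density, mean_free_path, u0):
    """sigma from the last form of (29-51)."""
    return 2.0 * n_density * E_CHARGE ** 2 * mean_free_path / (3.0 * M_E * u0)


def ratio_coefficient_simple():
    """Coefficient of T in (29-52): (3 pi^2 / 8)(k/e)^2."""
    return 3.0 * math.pi ** 2 / 8.0 * (K_B / E_CHARGE) ** 2


def ratio_coefficient_sommerfeld():
    """Reference value (pi^2 / 3)(k/e)^2 from the more exact theory."""
    return math.pi ** 2 / 3.0 * (K_B / E_CHARGE) ** 2


if __name__ == "__main__":
    T = 300.0
    # Two hypothetical metals: (N/V in m^-3, l in m, u0 in m/s).
    for n, l, u0 in ((8.5e28, 4.0e-8, 1.6e6), (2.5e28, 1.0e-8, 1.1e6)):
        K = thermal_conductivity(n, l, u0, T)
        sigma = electrical_conductivity(n, l, u0)
        print(K / (sigma * T), ratio_coefficient_simple())
    print(ratio_coefficient_sommerfeld())
```

The two coefficients differ by the factor 9/8, an instance of the small numerical factors by which the simplified treatment departs from the more exact theory.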


Exercises 


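29-1. By following the procedure which led to (29-25), show that, to the next
higher order approximation, $\zeta$ and $U$ are given by

$$\zeta = \zeta_0\left[1 - \frac{\pi^2}{12}\left(\frac{kT}{\zeta_0}\right)^2 - \frac{\pi^4}{80}\left(\frac{kT}{\zeta_0}\right)^4\right]$$

$$U = \frac{3}{5}N\zeta_0\left[1 + \frac{5\pi^2}{12}\left(\frac{kT}{\zeta_0}\right)^2 - \frac{\pi^4}{16}\left(\frac{kT}{\zeta_0}\right)^4\right]$$

rather than by (29-26) and (29-28).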
29-2. Show that the average energy of the thermionic electrons which escape 
perpendicular to the surface is kT. 
29-3. A typical number density of electrons in a vacuum tube may be about 


10!” (meter)~%. Should one use the Fermi-Dirac results to discuss this case, 
or will the Boltzmann limit be adequate? 


30 Semiconductors


In spite of the success of the free electron theory of metals in accounting 
for the low-temperature heat capacity and in providing a plausible model 
for conductivities, it is unsatisfactory in the sense that it does not directly 
provide a way of predicting whether a given solid will have a high conduc- 
tivity and thus be classed as a metal or whether it will have a very low 
conductivity and be classed as an insulator. Modern developments in 
solid state theory have gone a long way toward resolving this difficulty. 
In addition, materials of intermediate properties, known as semiconduc- 
tors, have become of great interest and importance. Our aim in this 
chapter will be to discuss these materials briefly as another application 
of statistical mechanics by accepting and using the current quantum
mechanical models. Since we shall be interested only in the properties of 
the electrons as they relate to the electrical conductivity, we shall use the 
Fermi-Dirac distribution. As we can see from our basic formulas (27-27), 
(27-28), and (28-1), the information which we require is the distribution 
and energies of the electronic states. 


30-1 Energy band picture of crystals 


One of the principal effects which is neglected in the free electron theory 
is the interaction of the electrons with the positively charged metallic ions. 
Since these ions are located at the various lattice points of the crystal, the 
potential energy of the electrons will also be periodic with the spatial 
periodicity and symmetry of the crystal structure. The basic quantum 
mechanical problem is then to solve the Schroedinger wave equation for a 
periodic potential. 

The results of this calculation show that the possible electron energies 
are very close together and can be treated as almost continuously spaced, 
much as we have been doing for the free particle in a large box. In 
addition, the energies can be divided into bands. In other words, certain 
ranges of energy values are possible for the electron moving in a periodic 
potential, whereas other ranges of the energies are not possible. Those of 
the first type are called allowed energies, and those of the latter are called 
forbidden energies; the corresponding sets of bands are allowed bands and 
forbidden bands. This situation is illustrated in the conventional schematic 
manner in Fig. 30-1 in which energy is plotted versus some position 
coordinate through the crystal. The energy states are represented by 
horizontal lines to indicate that the electron is not localized but can be 
found anywhere within the crystal; the spacing is of course much closer 
than that shown. 

The periodic nature of the crystal structure is the basic reason for the 
division into bands. Similar results can be obtained when the normal 
frequencies of coupled mechanical systems are calculated. An example of 
such a system is the weighted string consisting of equal masses spaced 
equal distances apart and coupled to each other by like forces; this system 
is discussed in (I: Chapters 13 and 15). It is found that waves cannot be 
propagated along the weighted string if the frequency is greater than some 
maximum value, but that such a frequency corresponds to an attenuated 
disturbance instead. Thus frequencies are divided into two bands for
this system, an allowed band for which $0 < \nu < \nu_{\max}$, and a forbidden
band for which $\nu > \nu_{\max}$.
The formation of bands can also be understood qualitatively from a
different approach which is illustrated in Fig. 30-2. When the atoms are
far apart, each atom has its own discrete energy states and the system will
have correspondingly discrete levels; two of these levels are shown in the
figure. For $N$ atoms, there are a total of $2N$ states in each level because
of the electron spin. As it is imagined that the atoms are brought closer
together as the crystal is formed, the forces between the atoms begin to
become important. The energies of the previously degenerate states are
thus altered, and they begin to separate as shown. The net result when the
equilibrium distance $r_0$ is reached is that a band of energies has originated
from each level; there will be a total of $2N$ states in each band, and
hence a total of $2N$ electrons will "fill" the band. In any event, the exact
values of the locations of the bands and their widths will depend on the
particular crystal.

[Fig. 30-2]

When the effect of external fields on an electron in an allowed state of a 
band is investigated, it is found that the electron can be quite accurately 
described as reacting much like a free particle, except with an effective 
mass m* which is different from the mass of an unbound electron. The 
exact magnitude of m* depends on the details of the band structure, and it 
will be sufficient for our purposes to accept it as one of the state parameters. 
The situation is somewhat analogous to the problem of a sphere moving 
in an ideal incompressible fluid which is discussed in (I: Sec. 18-6); in 
this case it is also found that the net result of all the interactions of the 
sphere with the fluid is, in effect, to alter the inertia of the sphere as far as 
its behavior with respect to external forces is concerned. 

Since the electron in an allowed band has these attributes of a free
particle, we can use the expression for the density of states obtained from
(28-3) and (29-3), except that now $\epsilon$ is to be replaced in the formula by the
amount the electron energy exceeds the energy of the bottom of the band.
For example, if we have a situation like that of Fig. 30-1 and if we consider
the band for which $\epsilon_1'' < \epsilon < \epsilon_2''$, the density of states in this band will be
assumed to be given by

$$c(\epsilon)\,d\epsilon = 4\pi V\left(\frac{2m^*}{h^2}\right)^{3/2}(\epsilon-\epsilon_1'')^{1/2}\,d\epsilon = C^*(\epsilon-\epsilon_1'')^{1/2}\,d\epsilon \qquad (30\text{-}1)$$


In the absence of external fields there is no net transfer of charge;
hence for every state in an allowed band which corresponds to an electron
moving in a given direction there must be a state corresponding to motion
in the opposite direction. Consequently, the total current of a completely
filled band is zero:

$$I = \sum_i e u_i = 0 \qquad (30\text{-}2)$$

which can also be written

$$e u_j + \sum_{i \neq j} e u_i = 0 \qquad (30\text{-}3)$$

Suppose now that the $j$th electron is missing from the band for some
reason; the total current then is

$$I' = \sum_{i \neq j} e u_i = -e u_j \qquad (30\text{-}4)$$

because of (30-3). In other words, the motions of the electrons in the band 
no longer exactly cancel and there is a net current equivalent to that 
carried by a single carrier whose charge is opposite to that of an electron 
and hence is positive. Thus a band with an electron missing acts as if it 
were equivalent to a positive charge |e|; this resultant behavior is called a 
hole, and it provides us with a simplified and useful picture of the over-all 
situation by enabling us to ascribe independent existence to a hole as a 
definite and single carrier of positive charge. In the presence of an electric 
field, therefore, a hole will move in the direction of the field, whereas an 
electron will move opposite to the field. 
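The bookkeeping behind (30-2) through (30-4) can be illustrated with a toy calculation. In the sketch below a filled band is represented by a list of velocities that occur in equal and opposite pairs, the electron charge is taken as $-1$ in arbitrary units, and one electron is then removed. The function name and the numerical velocities are illustrative assumptions only.

```python
def band_current(velocities, charge):
    """Total current of a set of carriers, as in (30-2): sum of e * u_i."""
    return charge * sum(velocities)


if __name__ == "__main__":
    e = -1.0  # electron charge in arbitrary units (negative)
    # A filled band: every velocity is paired with its opposite.
    filled = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
    print(band_current(filled, e))       # zero for the filled band

    j = 4                                # index of the missing electron
    u_j = filled[j]
    remaining = filled[:j] + filled[j + 1:]
    # Current with one electron missing equals -e * u_j, as in (30-4).
    print(band_current(remaining, e), -e * u_j)
```

The remaining current has the sign of $|e|u_j$, which is the current of a single positive carrier moving with the velocity of the missing electron.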

The completely filled band of highest energy is called the valence band; 
the next band above this is called the conduction band. Suppose now that 
each of the N atoms contributes one electron to the band; there will be a 
total of N electrons available for the 2N states of the band and, at T = 0, 
the conduction band will be exactly half full as illustrated in Fig. 30-3a. 
Even a small electric field applied to the material will be able to give the 
electrons energy because there are possible states just above the topmost 
filled one. Thus the electrons will be set in motion by an arbitrarily small 
field, and the material will have the conductivity properties of a metal. 
Suppose, instead, that each atom contributes two electrons to the band; 
there will be a total of 2N electrons and the valence band will be completely 
filled as shown in Fig. 30-3b. Consequently, an applied electric field will
not generally be able to make the electrons change their states because 
those immediately above the topmost filled one are forbidden. Therefore 
the field will not produce a current, the conductivity will be zero, and the 
material will be an insulator. 


[Fig. 30-3: (a) Metal; (b) Insulator]

It is clear that, if some of the electrons in the filled valence band could
somehow be given enough energy to excite them into the conduction band, 
they could be accelerated by the field and the conductivity would be 
different from zero. At the same time, the holes left in the valence band 
could also be accelerated and could contribute to the conductivity. 
The most common method of increasing the electron energies is to raise 
the temperature. In the next section, therefore, we shall calculate the 
conductivity resulting from thermal excitation. 


30-2 Insulators and intrinsic semiconductors 


We neglect any possible effects arising from filled bands with energy 
below the uppermost or valence band; the band picture we assume 
therefore is that illustrated in Fig. 30-4 for $T \neq 0$. In this figure we also
illustrate the conventional pictorialization of electrons in the conduction 
band and the resultant holes in the valence band. 

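We can find the total number of electrons $N_c$ in the conduction band by
adapting (29-5) to this case with the use of (29-7) and (30-1); the result is

$$N_c = C_c^* \int_{\epsilon_c}^{\text{top of band}} f(\epsilon)\,(\epsilon-\epsilon_c)^{1/2}\,d\epsilon \qquad (30\text{-}5)$$

where $C_c^*$ is given by (30-1) in terms of the effective electron mass $m_c^*$.
Our experience with the properties of $f(\epsilon)$ for a finite temperature as
illustrated in Fig. 29-2a leads us to think that the Fermi energy $\zeta$ will be
located somewhere between the bands, as shown in Fig. 30-4; we shall
verify this surmise later. As a result, the most important contributions to
(30-5) can be expected to come from near the bottom of the band, and we
can replace the upper limit by $+\infty$. Generally, we shall also have
$\epsilon_c - \zeta \gg kT$, so that $f(\epsilon)$ can be approximated by $e^{-\beta(\epsilon-\zeta)}$ and (30-5)
becomes

$$\begin{aligned}
N_c &= C_c^*\int_{\epsilon_c}^{\infty} f(\epsilon)\,(\epsilon-\epsilon_c)^{1/2}\,d\epsilon \\
&\approx C_c^*\int_{\epsilon_c}^{\infty} e^{-\beta(\epsilon-\zeta)}(\epsilon-\epsilon_c)^{1/2}\,d\epsilon \\
&= C_c^*\,e^{\beta(\zeta-\epsilon_c)}\int_0^{\infty} e^{-\beta s}s^{1/2}\,ds \\
&= \frac{\sqrt{\pi}\,C_c^*\,e^{\beta(\zeta-\epsilon_c)}}{2\beta^{3/2}} \\
&= \frac{2V}{h^3}(2\pi m_c^* kT)^{3/2}\,e^{(\zeta-\epsilon_c)/kT} \qquad (30\text{-}6)
\end{aligned}$$

with the use of (18-9) and (30-1).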

The number of holes, $N_h$, can be obtained in a similar manner if we
realize that the free particle behavior of a hole is given by its energy
distance below the top of the valence band; thus in (30-1) we must use
$(\epsilon_v-\epsilon)^{1/2}$ in place of $(\epsilon-\epsilon_1'')^{1/2}$. Since a hole represents a missing
electron, the probability of a hole is equal to the probability that an
electron is not occupying the state and is therefore $1 - f(\epsilon)$. Thus,
instead of (30-5), we obtain

$$\begin{aligned}
N_h &= C_h^*\int_{-\infty}^{\epsilon_v}[1-f(\epsilon)]\,(\epsilon_v-\epsilon)^{1/2}\,d\epsilon \\
&= C_h^*\int_{-\infty}^{\epsilon_v}\frac{(\epsilon_v-\epsilon)^{1/2}\,d\epsilon}{1+e^{-\beta(\epsilon-\zeta)}} \\
&\approx \frac{2V}{h^3}(2\pi m_h^* kT)^{3/2}\,e^{(\epsilon_v-\zeta)/kT} \qquad (30\text{-}7)
\end{aligned}$$

Since each electron raised to the conduction band leaves a hole behind, we
must have

$$N_c = N_h \qquad (30\text{-}8)$$


Thus we can find $\zeta$ by substituting (30-6) and (30-7) into (30-8) with the
result that

$$\zeta = \tfrac{1}{2}\epsilon_c + \tfrac{1}{2}\epsilon_v + \tfrac{1}{2}kT\ln\frac{C_h^*}{C_c^*} = \epsilon_v + \tfrac{1}{2}\epsilon_g + \tfrac{3}{4}kT\ln\frac{m_h^*}{m_c^*} \qquad (30\text{-}9)$$

where

$$\epsilon_g = \epsilon_c - \epsilon_v \qquad (30\text{-}10)$$

is known as the gap width. If $T = 0$ or if $m_c^* = m_h^*$, the Fermi level lies
exactly halfway between the top of the valence band and the bottom of the
conduction band.

The number of conduction electrons and holes can be found by substituting
(30-9) into (30-6) and (30-7) or by multiplying (30-6) and (30-7)
together and using (30-8); the result is

$$N_c = N_h = \frac{2V}{h^3}(2\pi kT)^{3/2}(m_c^* m_h^*)^{3/4}\,e^{-\epsilon_g/2kT} \qquad (30\text{-}11)$$

which is much like a Boltzmann distribution except that only half the gap
width $\epsilon_g$ appears in the exponent.
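The following sketch evaluates (30-9) and (30-11) per unit volume and checks them against (30-6). The constants are CODATA values; the function names, the choice of the free electron mass for both effective masses, and the gap of 1 electron volt at 300 K are illustrative assumptions rather than data for any particular material.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
H = 6.62607015e-34          # Planck constant, J s
E_CHARGE = 1.602176634e-19  # joules per electron volt
M_E = 9.1093837015e-31      # free electron mass, kg


def band_prefactor(m_eff, T):
    """2 (2 pi m* k T / h^2)^(3/2): the factor in (30-6) divided by V."""
    return 2.0 * (2.0 * math.pi * m_eff * K_B * T / H ** 2) ** 1.5


def intrinsic_carriers(gap_eV, T, m_c, m_h):
    """Return (n, zeta - eps_v in eV) from (30-11) and (30-9)."""
    kT_eV = K_B * T / E_CHARGE
    n = (2.0 * (2.0 * math.pi * K_B * T / H ** 2) ** 1.5
         * (m_c * m_h) ** 0.75 * math.exp(-gap_eV / (2.0 * kT_eV)))
    zeta_offset = 0.5 * gap_eV + 0.75 * kT_eV * math.log(m_h / m_c)
    return n, zeta_offset


def conduction_density(zeta_minus_eps_c_eV, T, m_c):
    """n_c = N_c / V from (30-6), with zeta - eps_c given in eV."""
    kT_eV = K_B * T / E_CHARGE
    return band_prefactor(m_c, T) * math.exp(zeta_minus_eps_c_eV / kT_eV)


if __name__ == "__main__":
    n, offset = intrinsic_carriers(1.0, 300.0, M_E, M_E)
    print(n, offset)  # carriers per cubic meter, Fermi level above eps_v in eV
```

With equal effective masses the Fermi level comes out exactly at mid-gap, and unequal masses shift it by an amount of order $kT$, in line with the remark following (30-10).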

Our previous expression for conductivity for one type of charge carrier
is given by (29-51). It is generally convenient to write this in the form

$$\sigma = n|e|\mu \qquad (30\text{-}12)$$

where $n = N/V$ is the density of carriers and $\mu$ is called the mobility and is
defined as the ratio of the magnitude of the average velocity produced in
the direction of the applied electric field to the field. For the simple case of
(29-49), the mobility is

$$\mu = \frac{|e|\tau}{2m} \qquad (30\text{-}13)$$

The processes determining the mobility are generally more complicated
than they were assumed to be in obtaining (30-13); however, (30-13)
gives one an idea of what parameters of the system are of importance in
determining $\mu$.

In the present case, when there are two types of carriers, (30-12) can be
generalized to

$$\sigma = |e|(n_c\mu_c + n_h\mu_h) = |e|\,n_c(\mu_c + \mu_h) \qquad (30\text{-}14)$$

because of (30-8). Substituting (30-11) into (30-14), we obtain

$$\sigma = \frac{2|e|}{h^3}(\mu_c+\mu_h)(2\pi kT)^{3/2}(m_c^* m_h^*)^{3/4}\,e^{-\epsilon_g/2kT} \qquad (30\text{-}15)$$
It is customary to regard $\ln\sigma$ as a function of $1/T$; if we neglect any
possible temperature dependence of the mobilities, we find from (30-15)
that

$$\ln\sigma = \text{const.} - \frac{\epsilon_g}{2kT} - \frac{3}{2}\ln\frac{1}{T} \qquad (30\text{-}16)$$

Since the term $\ln(1/T)$ varies slowly with $1/T$, the curve (30-16) is essentially
a straight line with slope $-\epsilon_g/2k$ as shown in Fig. 30-5. The figure
also demonstrates that the conductivity increases as the temperature increases,
in contrast to the behavior found for metals. We see from (30-14) that the
essential reason for this is that the carrier density increases very rapidly with
temperature; for metals, on the other hand, the carrier density stays fairly
constant while the average time between collisions decreases with increasing
temperature, and hence the mobility decreases by (30-13).

[Fig. 30-5: $\ln\sigma$ plotted against $1/T$]

Equation (30-15) shows that the predominant factor determining $\sigma$ is the value
of the gap width $\epsilon_g$, and this has made it possible to classify materials roughly
into two groups. If $\epsilon_g$ is large, there will be
comparatively few carriers excited and the solid will be an insulator; if
$\epsilon_g$ is smaller, there will be an appreciable number of electrons excited into
the conduction band and the conductivity will be larger than that of an
insulator but not so large as that of a metal; such a material is called
an intrinsic semiconductor. The distinction between the two is clearly a
relative matter, although all intrinsic semiconductors will be insulators
at $T = 0$. The gap width is usually given in electron volts; one electron
volt equals $1.60 \times 10^{-19}$ joule. A good insulator such as diamond has
a gap width of about 7 electron volts, while typical intrinsic semi- 
conductors such as germanium and silicon have gap widths of about 1 
electron volt; by way of comparison, kT ~ 0.025 electron volt for room 
temperature. 
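The statement that (30-16) is essentially a straight line of slope $-\epsilon_g/2k$ can be examined numerically. The sketch below evaluates the temperature dependent part of (30-16) at two temperatures and compares the resulting slope with $-\epsilon_g/2k$. The function names, the gap of 1 electron volt, and the temperatures of 300 and 400 K are illustrative assumptions; the additive constant is dropped and the mobilities are taken to be independent of temperature, as in the text.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # joules per electron volt


def kT_in_eV(T):
    """Thermal energy kT expressed in electron volts."""
    return K_B * T / E_CHARGE


def ln_sigma(T, gap_eV):
    """Temperature dependent part of (30-16); the constant is omitted."""
    return -gap_eV / (2.0 * kT_in_eV(T)) - 1.5 * math.log(1.0 / T)


def slope_vs_inverse_T(T1, T2, gap_eV):
    """Finite-difference slope of ln(sigma) against 1/T between T1 and T2."""
    return (ln_sigma(T2, gap_eV) - ln_sigma(T1, gap_eV)) / (1.0 / T2 - 1.0 / T1)


if __name__ == "__main__":
    gap = 1.0  # illustrative gap width in eV
    ideal = -gap * E_CHARGE / (2.0 * K_B)
    print(slope_vs_inverse_T(300.0, 400.0, gap), ideal)
    print(kT_in_eV(293.0))  # close to the 0.025 eV quoted for room temperature
```

The computed slope lies within roughly ten percent of $-\epsilon_g/2k$ for these values; the difference comes entirely from the slowly varying logarithmic term.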


30-3 Impurity semiconductors 


The conductivity of most semiconductors is not intrinsic but rather is 
due to impurities which may be atoms of a type different from those 
normally in the crystal lattice or may be due to an excess of one constituent. 
Since such impurities will destroy the periodicity of the crystal, it may be
possible to have allowed electron energies in the previously forbidden 
band between the valence and conduction bands. 

It has been found that these impurity levels are of two general types. 
In one type there may be electronic levels which are occupied at T = 0; 
the bound electrons in these levels are localized near the impurities, and
consequently the levels are represented by short bars on a diagram such as 
Fig. 30-6. These electrons do not contribute to the conductivity unless 
they are excited to the conduction band, as is shown for one level in the 
figure; these levels are therefore called donor levels. An electron excited 
from a donor level does not leave a hole behind because the electron was 
initially localized and was not in a filled band. Similarly, there may be 
levels which represent a state which lacks one electron of being a normal 
situation for the particular crystal. These are called acceptor levels since 
it is possible for them to be filled with an electron excited from the valence 
band. Such a process will produce a hole, and conductivity will be possible;
the electron shown excited to the acceptor level in Fig. 30-6 will not 
contribute to the conductivity because it is localized at this position in the 
lattice. 

Such levels as these often result when an impurity atom of valence 
different from that of the original constituent of the crystal is substituted 
into the lattice. If the impurity atom has a higher valence, there will be 
one more electron than is needed to form the valence bonds; this electron
will generally be loosely bound and will correspond to a donor level. 
Similarly, an acceptor level can result from an impurity of valence less 
than that of the atoms of the host crystal. It is perfectly possible, of course, 
for a material to have both types of levels present, and, in addition, the 
levels need not all have the same energy, the $\epsilon_d$ or $\epsilon_a$ of Fig. 30-6.

As an example of the type of calculations involved for an impurity 
semiconductor, let us assume that there are only donor levels present. In
addition, let us initially assume that the temperature is low enough that we
can neglect any possible excitation of electrons from the valence band.
The number of conduction electrons, $N_c$, will again be given by the
expression (30-6), except that $\zeta$ will now be different. If $N_d$ is the number of
donor levels present and all are occupied at $T = 0$, then $N_c$ also equals the
number of donor levels which are unoccupied when $T \neq 0$; therefore

$$N_c = N_d[1 - f(\epsilon_d)] = \frac{N_d}{1 + e^{(\zeta-\epsilon_d)/kT}} \qquad (30\text{-}17)$$

[Fig. 30-6: band diagram showing donor levels and acceptor levels]


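The Fermi level can be found by equating (30-6) and (30-17); hence

$$\frac{N_d}{1+e^{(\zeta-\epsilon_d)/kT}} = \frac{2V}{h^3}(2\pi m_c^* kT)^{3/2}\,e^{(\zeta-\epsilon_c)/kT} \qquad (30\text{-}18)$$

We can solve this approximately for $\zeta$ under the conditions $\zeta - \epsilon_d \gg kT$,
since then the denominator on the left becomes $e^{-(\zeta-\epsilon_d)/kT}$ in the
numerator and we find

$$\zeta = \epsilon_d + \frac{\Delta\epsilon}{2} + \frac{kT}{2}\ln\left[\frac{N_d}{V}\,\frac{h^3}{2(2\pi m_c^* kT)^{3/2}}\right] \qquad (30\text{-}19)$$

where

$$\Delta\epsilon = \epsilon_c - \epsilon_d \qquad (30\text{-}20)$$

represents the energy required to ionize a donor level by removing its
electron to the conduction band. When $T = 0$, $\zeta = \epsilon_d + \tfrac{1}{2}\Delta\epsilon$ and
therefore is exactly halfway between the donor levels and the bottom of
the conduction band.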

The density of conduction electrons can be found from (30-17) and
(30-19) with the result

$$n_c = \frac{N_c}{V} = \left(\frac{N_d}{V}\right)e^{-\beta(\zeta-\epsilon_d)} = \left(\frac{2N_d}{V}\right)^{1/2}\left(\frac{2\pi m_c^* kT}{h^2}\right)^{3/4}e^{-\Delta\epsilon/2kT} \qquad (30\text{-}21)$$

so that $n_c$ is proportional to the square root of the density of donor levels.
Again we find half the energy difference $\tfrac{1}{2}\Delta\epsilon$ entering in the Boltzmann
type of exponential factor. The conductivity for the donor case is given by
$\sigma = |e|\mu_c n_c$ and will be approximately described by a straight line of
slope $-\Delta\epsilon/2k$ if $\ln\sigma$ is plotted against $1/T$. This property of the curve
is the basis of a method for determining the value of $\Delta\epsilon$.
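The square root dependence in (30-21) can be checked against (30-19) with a short calculation. The sketch below computes $\zeta - \epsilon_d$ from (30-19), obtains $n_c$ from the middle form of (30-21), and compares it with the closed form on the right. The function names and the sample values (a donor density of $10^{22}$ per cubic meter, $\Delta\epsilon = 0.05$ electron volt, $T = 30$ K, and the free electron mass in place of $m_c^*$) are illustrative assumptions only.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
H = 6.62607015e-34          # Planck constant, J s
E_CHARGE = 1.602176634e-19  # joules per electron volt
M_E = 9.1093837015e-31      # free electron mass, kg


def fermi_level_above_donor(nd, delta_eps_eV, T, m_c=M_E):
    """zeta - eps_d in eV from (30-19), with nd = N_d / V in m^-3."""
    kT_eV = K_B * T / E_CHARGE
    arg = nd * H ** 3 / (2.0 * (2.0 * math.pi * m_c * K_B * T) ** 1.5)
    return 0.5 * delta_eps_eV + 0.5 * kT_eV * math.log(arg)


def donor_density_from_zeta(nd, delta_eps_eV, T, m_c=M_E):
    """n_c = (N_d / V) exp(-(zeta - eps_d) / kT), middle form of (30-21)."""
    kT_eV = K_B * T / E_CHARGE
    return nd * math.exp(-fermi_level_above_donor(nd, delta_eps_eV, T, m_c) / kT_eV)


def donor_density_closed_form(nd, delta_eps_eV, T, m_c=M_E):
    """Right-hand form of (30-21)."""
    kT_eV = K_B * T / E_CHARGE
    return (math.sqrt(2.0 * nd)
            * (2.0 * math.pi * m_c * K_B * T / H ** 2) ** 0.75
            * math.exp(-delta_eps_eV / (2.0 * kT_eV)))


if __name__ == "__main__":
    nd, d_eps, T = 1.0e22, 0.05, 30.0  # illustrative values
    print(fermi_level_above_donor(nd, d_eps, T))
    print(donor_density_from_zeta(nd, d_eps, T), donor_density_closed_form(nd, d_eps, T))
```

Quadrupling the donor density in this sketch doubles $n_c$, which is the square root law stated above; the approximation behind (30-19) requires $\zeta - \epsilon_d \gg kT$, so the sample temperature was chosen low.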
[Fig. 30-7: $\ln\sigma$ versus $1/T$, with the intrinsic region labeled]

There is always a contribution to the conductivity from the valence band,
however, and at high temperatures it can become dominant over the
impurity contribution because the number of electrons in the valence band
is so much larger than the number of impurity levels that this effect can
outweigh the disparity between $\epsilon_g$ and $\Delta\epsilon$. The result is that the $\ln\sigma$
versus $1/T$ curve can show a change from a slope characteristic of impurity
behavior to that characteristic of intrinsic behavior as illustrated in
Fig. 30-7.


Exercises 


30-1. If there are $N_a$ acceptor levels of energy $\epsilon_a$ as in Fig. 30-6, and if they
are all empty at $T = 0$, show that the density of free holes in the valence band is

$$n_h = \left(\frac{2N_a}{V}\right)^{1/2}\left(\frac{2\pi m_h^* kT}{h^2}\right)^{3/4}e^{-(\epsilon_a-\epsilon_v)/2kT}$$


30-2. Calculate the total conductivity when donor levels and the valence 
band are both included, and verify the general appearance of Fig. 30-7. What 
is the approximate temperature corresponding to the intermediate behavior 
of the curve? 

30-3. Assume that only a small fraction of the possible donor levels are
occupied by electrons at $T = 0$. Show that under these conditions the density
of free electrons is proportional to $(N_d/V)e^{-\Delta\epsilon/kT}$, and compare your result
with (30-21). What are the experimental consequences of this result? How
might you be able to distinguish between these two possibilities in the laboratory?
31 General systems and the canonical ensemble


Our specific results so far are applicable only to isolated systems composed 
of independent subsystems, and we naturally would like to formulate 
statistical mechanics so that it can also be used to treat more complicated 
systems such as dense gases or liquids for which the interactions among the 
particles are important. 

We recall that when we originally introduced the concept of an ensemble 
we wanted to construct it in a way which would represent our knowledge 
of the system of interest as accurately as possible. Then we defined the 
microcanonical ensemble (20-23) as a suitable representation of an 
isolated system and based all our subsequent calculations on this ensemble. 
However, we realize that the concept of an isolated system is somewhat 
unrealistic for many cases, and we need to be able to discuss systems 
which are able to interact with and come to thermal equilibrium with their 
surroundings. In particular, we are interested in systems which are kept at 
constant temperature by thermal contact with a heat reservoir so that 
mutual exchanges of energy are possible. Therefore our system can no 
longer be regarded as having constant energy, and the microcanonical 
ensemble is not suitable in principle for calculating the system properties. 
The ensemble which is appropriate for describing the equilibrium prop- 
erties of a system in contact with a heat reservoir and therefore at constant 
temperature is called a canonical ensemble. Our aim is to define the 
canonical ensemble accurately and then to determine its characteristics. 
We were faced with a similar situation in thermodynamics when we tried 
to formulate the basic results as given for isolated systems in terms of the 
properties of a system free to interact with its surroundings, and many of 
the considerations we used in Sec. 11-3 will be very helpful in indicating 
how we should presently proceed. 


31-1 Derivation of the canonical ensemble from the 
microcanonical ensemble 


Figure 11-1 illustrates the basic idea of our thermodynamic treatment in 
that we regarded the isolated system as being composed of the system of 
interest plus the heat reservoir. A microcanonical ensemble is evidently 
what we want to use for the isolated system. 
Suppose that we assume our ensemble to be composed of a very large
number $\mathcal{M}$ of imaginary copies of the system of interest. Since these
copies are mental constructs, they can be treated as independent and
distinguishable and we can apply exactly the same considerations to the
ensemble as we did in Sec. 21-1 to the $N$ independent subsystems in
$\mu$-space. In fact, since each member is represented by a single point in
$\Gamma$-space, the full ensemble is represented by $\mathcal{M}$ points in $\Gamma$-space. Thus
we can actually regard this ensemble as a whole as itself a microcanonical
ensemble of $\mathcal{M}$ members, and then $\Gamma$-space for the system is actually the
$\mu$-space for this ensemble; the total constant energy $E$ of this microcanonical
ensemble will be

$$E = \mathcal{M}U = \text{const.} \qquad (31\text{-}1)$$

where $U$ is the average energy of each member which represents the
system of interest. The ensemble as a whole can be represented as a single
point in a "super" $\Gamma$-space which bears the same conceptual relation to
$\Gamma$-space as $\Gamma$-space did to $\mu$-space in Chapter 21; the condition (31-1)
then represents a surface of constant energy in this super $\Gamma$-space. There
will be no problems about the applicability of Stirling's formula, because
we are free to choose $\mathcal{M}$ as large as we please and we can let $\mathcal{M} \to \infty$;
nor do we need to employ corrected Boltzmann counting, since the
ensemble members are distinguishable mental constructs.

We can assume in addition that there is a very slight interaction among
the members of the ensemble. The reason for this assumption is that, if we
look at any particular member of the ensemble, it will be interacting
slightly with the remaining members. In other words, a given member of
the ensemble can be regarded as being in contact, and able to exchange
energy, with a heat reservoir composed of the other $\mathcal{M} - 1$ members of the
ensemble. The parallel to Fig. 11-1 is now complete and is shown in these
terms in Fig. 31-1. What we are interested in is the probability $p_n$ of
finding a system in the particular cell in Γ-space corresponding to the
energy $E_n$; therefore

$$p_n(E_n) = \frac{N_n}{\mathcal{M}} \tag{31-2}$$

where $N_n$ is the number of ensemble members in this particular region of
Γ-space.

Fig. 31-1. Member of the ensemble (system of interest); remaining $\mathcal{M} - 1$
ensemble members (heat reservoir); microcanonical ensemble (isolated system).

We can proceed exactly as before and look for the state of the ensemble
as a whole which has maximum probability subject to the conditions (31-1)
and $\mathcal{M}$ = const.; in this way we find the distribution $N_n$ of ensemble
members which gives maximum probability. If we review the calculations
in μ-space which we did in Sec. 21-3, we see that the problem we want to
solve now is exactly the one we have already solved in Sec. 21-3 by using
(21-26) through (21-28) and that the probability (31-2) which we want can
be obtained at once from (21-33) and (21-35) by a straightforward change
in symbols, namely,

$$p_n(E_n) = \frac{e^{-\beta E_n}}{Z} \tag{31-3}$$

where $p_n$ is the probability of finding the system of interest in the nth
cell of energy $E_n$, and where

$$Z = \sum_n e^{-\beta E_n} \tag{31-4}$$

is called the partition function for the canonical ensemble.
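As a small numerical illustration of (31-3) and (31-4), the following Python sketch evaluates Z and the probabilities $p_n$ for a short list of energy levels. The three levels and the function name `canonical_probabilities` are hypothetical choices made only for this example; energies are measured in units in which $\beta = 1$.

```python
import math

def canonical_probabilities(energies, beta):
    """Return (Z, [p_n]) for the canonical distribution (31-3), (31-4)."""
    # Shift by the lowest energy for numerical stability; the p_n are unaffected.
    e0 = min(energies)
    weights = [math.exp(-beta * (e - e0)) for e in energies]
    z_shifted = sum(weights)
    probs = [w / z_shifted for w in weights]
    z = z_shifted * math.exp(-beta * e0)  # undo the shift to recover Z itself
    return z, probs

# Hypothetical three-level system (assumption, not taken from the text)
levels = [0.0, 1.0, 2.0]
Z, p = canonical_probabilities(levels, beta=1.0)
print(Z, p)
```

The probabilities sum to unity by construction, and the lower levels are the more probable ones, as the Boltzmann factor requires.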

In spite of the similarity of these formulas to those of Chapter 21, there
are essential differences which are important to keep in mind: the
probabilities $p_n$ refer to a cell in Γ-space, and $E_n$ is the energy of the
system as a whole. Also, we can extend this result at once to the quantum
case by using (22-20) and associating a quantum state n of the system as a
whole with each cell in Γ-space of volume

$$\Delta\Omega_{\mathcal{N}} = h^{\mathcal{N}} \tag{31-5}$$

where $\mathcal{N}$ is the number of degrees of freedom of the system. Then, as in
(26-8), we can write (31-4) as

$$Z = \sum_{\text{cells}} e^{-\beta E_n} = \sum_{\substack{\text{quantum}\\ \text{states}}} e^{-\beta E_n} = \sum_n e^{-\beta E_n} \tag{31-6}$$

If circumstances allow it, we can also evaluate Z as an integral by using
(31-5) and (31-6); the result is

$$Z = \frac{1}{h^{\mathcal{N}}} \int e^{-\beta H}\, d\Omega_{\mathcal{N}} \tag{31-7}$$

where H is the Hamiltonian of the whole system and the integral is taken
over all Γ-space and is not restricted to the single energy surface given by
(20-21) as it was before. For convenience, we shall continue to write many
of our results as sums rather than integrals.
As we expect by now, the quantities characteristic of the equilibrium
state of the system can be found from the partition function Z. For
example, the average energy U is

$$U = \bar{E} = \sum_n p_n E_n = -\frac{1}{Z}\frac{\partial Z}{\partial \beta} = -\frac{\partial \ln Z}{\partial \beta} \tag{31-8}$$

We need to speak of U as the average energy because our system is able to
exchange energy with its surroundings; we shall return to this point when
we discuss fluctuations in the next chapter.

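In order to identify the temperature and the entropy, we can proceed
much as we did in Sec. 21-4. In general, the energies $E_n$ will depend on
various parameters $a_\lambda$; hence we can write $E_n(a_1, \ldots, a_\lambda, \ldots)$. If we
define the corresponding average force $\bar{F}_\lambda$ by

$$\bar{F}_\lambda = \overline{\frac{\partial E_n}{\partial a_\lambda}} = \sum_n p_n \frac{\partial E_n}{\partial a_\lambda} \tag{31-9}$$

we find from (31-3) and (31-6) that

$$\bar{F}_\lambda = -\frac{1}{\beta}\frac{\partial \ln Z}{\partial a_\lambda} \tag{31-10}$$

Let us now consider a small change in the quantity $\beta U + \ln Z$; if we use
(31-8), (31-10), (21-56), (9-1), and (10-27), we obtain

$$d(\beta U + \ln Z) = \beta\,dU + U\,d\beta + \frac{\partial \ln Z}{\partial \beta}\,d\beta + \sum_\lambda \frac{\partial \ln Z}{\partial a_\lambda}\,da_\lambda = \beta\,dQ = \beta T\,dS$$

and therefore, if we define $k = 1/\beta T$ as before, we obtain

$$dS = k\,d(\beta U + \ln Z)$$

which leads to

$$S = \frac{U}{T} + k \ln Z \tag{31-11}$$

The Helmholtz function F is

$$F = U - TS = -kT \ln Z \tag{31-12}$$

so that the equation of state as obtained from (11-65) is

$$p = -\left(\frac{\partial F}{\partial V}\right)_T = kT\left(\frac{\partial \ln Z}{\partial V}\right)_T \tag{31-13}$$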
Thus we have succeeded in expressing our thermodynamic properties in
terms of quantities defined for the canonical ensemble.

Our results can be summarized in an interesting and easily remembered
fashion by eliminating Z from (31-12) and (31-6); the result is

$$e^{-F/kT} = \sum_n e^{-E_n/kT} \tag{31-14}$$
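The chain from Z to U, S, and F can be checked numerically. The sketch below is only an illustration under stated assumptions: the energy levels (in joules) and the function name `thermodynamics_from_levels` are hypothetical, and the formulas used are (31-8), (31-11), and (31-12).

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def thermodynamics_from_levels(energies, temperature):
    """Return (U, S, F) from (31-8), (31-11), (31-12) for levels given in joules."""
    beta = 1.0 / (K_B * temperature)
    e0 = min(energies)
    # Boltzmann factors relative to the lowest level, to avoid underflow
    weights = [math.exp(-beta * (e - e0)) for e in energies]
    z_shifted = sum(weights)
    ln_z = math.log(z_shifted) - beta * e0
    u = sum(e * w for e, w in zip(energies, weights)) / z_shifted  # (31-8)
    s = u / temperature + K_B * ln_z                               # (31-11)
    f = -K_B * temperature * ln_z                                  # (31-12)
    return u, s, f

# Hypothetical level scheme (assumption made for illustration only)
levels = [0.0, 1.0e-21, 2.5e-21]
U, S, F = thermodynamics_from_levels(levels, 300.0)
print(U, S, F)
```

The returned values satisfy F = U - TS to rounding error, which is just the statement (31-12).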


31-2 Quantum mechanical basis for the third law 


Now that we are easily able to discuss properties of the system as a 
whole, we can briefly show how the third law of thermodynamics has a 
simple and natural origin. 

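Our basic assumption is that there exists a lowest energy state of the
system whose energy is $E_0$. We want to investigate S as $T \to 0$, that is, as
$\beta \to \infty$. If we write the energy of the next higher state as $E_1 = E_0 + \Delta$
where $\Delta$ is finite, we find that

$$Z = e^{-\beta E_0} + e^{-\beta E_1} + \cdots = e^{-\beta E_0}\left(1 + e^{-\beta\Delta} + \cdots\right)$$

Let us assume that the temperature is already so low that $\beta\Delta \gg 1$ and
hence $e^{-\beta\Delta} \ll 1$. Then we can get a good approximation to Z by keeping
only the first two terms:

$$Z \approx e^{-\beta E_0}\left(1 + e^{-\beta\Delta}\right) \tag{31-15}$$

$$\ln Z \approx -\beta E_0 + \ln\left(1 + e^{-\beta\Delta}\right) \approx -\beta E_0 + e^{-\beta\Delta} \tag{31-16}$$

If we substitute (31-16) into (31-8) and (31-11), we find that

$$U = E_0 + \Delta\, e^{-\beta\Delta}$$

$$S = k(\beta\Delta + 1)\,e^{-\beta\Delta} \tag{31-17}$$

and therefore

$$\lim_{T \to 0} S = \lim_{\beta \to \infty} k(\beta\Delta + 1)\,e^{-\beta\Delta} = 0 \tag{31-18}$$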


This result is exactly in accord with the third law (13-1) and justifies the 
choice (31-11) for the entropy expression. 
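The approach of S to zero can be made concrete with a short numerical sketch. As an assumption for illustration, the spectrum is truncated to just the two lowest states with the lowest energy set to zero; the function names are hypothetical. The first function evaluates S/k from (31-11) without further approximation, and the second uses the low-temperature form (31-17).

```python
import math

def entropy_two_level_exact(beta_delta):
    """S/k for a two-state spectrum, from S = U/T + k ln Z with the lowest energy zero."""
    x = math.exp(-beta_delta)
    # U/(kT) = beta*Delta*x/(1 + x) and ln Z = ln(1 + x)
    return beta_delta * x / (1.0 + x) + math.log1p(x)

def entropy_low_temperature(beta_delta):
    """S/k from the low-temperature approximation (31-17)."""
    return (beta_delta + 1.0) * math.exp(-beta_delta)

for bd in (5.0, 10.0, 20.0, 40.0):
    print(bd, entropy_two_level_exact(bd), entropy_low_temperature(bd))
```

Both expressions decrease rapidly as $\beta\Delta$ grows, and they agree more and more closely, which is the content of (31-18).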

We also see from our method of obtaining (31-18) that it depended on 
the existence of a definite non-degenerate state of lowest energy separated 
by a discrete amount from the next higher state (or states) and thus it is 
basically of quantum mechanical origin. An example of such a lowest
state for the system as a whole is the T = 0 state for free electrons in a
metal discussed in Sec. 29-2.
31-3 Independent subsystems


Now let us see how the canonical ensemble results are related to those
obtained for the less general system of N independent subsystems previously
discussed. The Hamiltonian of the system as a whole has the form
(21-1), so that the total energy is simply a sum of the individual subsystem
energies $\epsilon_k$:

$$E = \sum_{k=1}^{N} \epsilon_k \tag{31-19}$$

If we let $\epsilon_i$ be the energy of the ith quantum state of the subsystem, and $n_i$
be the number in the ith state, we can transform the sum (31-19) to one
over the states, and we find that the energy $E_n$ of the corresponding
state of the system is

$$E_n = \sum_i n_i \epsilon_i \tag{31-20}$$

The expression (31-6) for the partition function becomes

$$Z = \sum_n e^{-\beta \sum_i n_i \epsilon_i} \tag{31-21}$$

and the terms in (31-20) and (31-21) are subject to the restriction of a
definite number of subsystems, N,

$$N = \sum_i n_i \tag{31-22}$$

The problem now is to evaluate the sums in (31-21).

Let us consider first the case of Boltzmann statistics which we originally
discussed and for which we considered the subsystems to be distinguishable.
A given energy $E_n$ can be obtained in a variety of ways, and we see from
(31-20) that we obtain a new distinguishable state, but with the same
energy, for each permutation of the distinguishable subsystems in the
given distribution $n_i$. If we let $Z_{(dB)}$ be the partition function for the
Boltzmann case of distinguishable particles, we can regroup the terms in
(31-21) to obtain

$$Z_{(dB)} = \sum_{(n_i)} \left( {\sum}' e^{-\beta \sum_i n_i \epsilon_i} \right) \tag{31-23}$$

where, as in (21-6), the symbol $(n_i)$ means a sum over all the possible
distributions (macrostates) $n_i$ compatible with (31-22). The primed sum is
meant to include all the possibilities for a given set of $n_i$, since they
correspond to the same energy $E_n = \sum_i n_i \epsilon_i$; the number of terms in $\sum'$,
however, is exactly the number of ways the N particles can be arranged
within the given distribution, and this equals $W_B$ given by (21-4).
Therefore (31-23) becomes

$$Z_{(dB)} = \sum_{(n_i)} \frac{N!}{n_1!\, n_2! \cdots n_i! \cdots}\, e^{-\beta \sum_i n_i \epsilon_i} \tag{31-24}$$

If we set $x_i = e^{-\beta \epsilon_i}$ and use the multinomial theorem (the generalization
of the binomial theorem to more than two terms), (31-24) can be written

$$\begin{aligned}
Z_{(dB)} &= \sum_{(n_i)} \frac{N!}{n_1!\, n_2! \cdots n_i! \cdots}\, x_1^{n_1} x_2^{n_2} \cdots x_i^{n_i} \cdots \\
&= (x_1 + x_2 + \cdots + x_i + \cdots)^N \\
&= \left( \sum_i e^{-\beta \epsilon_i} \right)^N = Z_0^N
\end{aligned} \tag{31-25}$$

where $Z_0$ is the independent particle partition function defined in (21-35).
When (31-25) is used in (31-11) and (31-12), we obtain (21-64) and (21-65),
as we expect.
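The identity (31-25) is easy to confirm by brute force for a small system. The following sketch, with a hypothetical set of three single-particle levels and N = 4, sums the Boltzmann factor over every assignment of distinguishable particles to levels and compares the result with $Z_0^N$; the function names are illustrative only.

```python
import itertools
import math

def z_distinguishable_brute_force(levels, n_particles, beta):
    """Sum exp(-beta*E) over every assignment of distinguishable particles to levels."""
    total = 0.0
    # Each tuple lists the level energy occupied by particle 1, 2, ..., N
    for assignment in itertools.product(levels, repeat=n_particles):
        total += math.exp(-beta * sum(assignment))
    return total

def z_single(levels, beta):
    """Independent particle partition function Z_0."""
    return sum(math.exp(-beta * e) for e in levels)

levels = [0.0, 0.7, 1.3]  # hypothetical single-particle energies
beta, n = 1.0, 4
lhs = z_distinguishable_brute_force(levels, n, beta)
rhs = z_single(levels, beta) ** n
print(lhs, rhs)
```

The enumeration visits each microstate once, so the $W_B$ terms belonging to one distribution $n_i$ appear automatically, and the two numbers agree to rounding error.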

Although the last result will be useful to us very soon, it cannot be the
correct approximation when the independent subsystems are identical
because we assumed that the subsystems are distinguishable in order to
obtain (31-25), and we know from Chapter 27 that we must treat them as
indistinguishable. Hence we must reconsider the problem.

We can again regroup the terms in (31-21) and obtain (31-23). Now,
however, each permutation of the $n_i$ will not correspond to a new state
because the new arrangement will be indistinguishable from the first.
Therefore there will be only one term in the sum $\sum'$ rather than the $W_B$ of
(31-24), since each distribution $n_i$ corresponds to only one different state.
Therefore, if we let $Z_i$ be the canonical partition function for
indistinguishable particles, we obtain

$$Z_i = \sum_{(n_i)} e^{-\beta \sum_i n_i \epsilon_i} \tag{31-26}$$

instead of (31-24). The sum in (31-26) is still over all distributions $n_i$
which are compatible with the constraint (31-22); it is very difficult to
carry out this sum because of (31-22), although a method devised by
Darwin and Fowler can be used for this purpose. Instead of considering
the general problem, therefore, we shall content ourselves with evaluating
(31-26) approximately in the high-temperature or Boltzmann limit for
indistinguishable particles.

We recall our previous result after (27-1) that only about one cell in every
240,000 is occupied by a molecule for a gas under normal conditions.
Therefore practically all the $n_i$ are either 0 or 1, and only a negligible
fraction of the cells have values of $n_i$ as large as 2 or 3. Hence we shall not
make much error by neglecting the latter cells and including in our sum
only cells for which $n_i$ is 0 or 1; for the cells which we do include, all the
$n_i! = 1$ since 0! = 1! = 1. Therefore we can approximate (31-26) as

$$\begin{aligned}
Z_{(iB)} &= \sum_{(n_i = 0,1)} e^{-\beta \sum_i n_i \epsilon_i} \\
&= \frac{1}{N!} \sum_{(n_i = 0,1)} \frac{N!}{n_1!\, n_2! \cdots}\, e^{-\beta \sum_i n_i \epsilon_i} \\
&\approx \frac{Z_{(dB)}}{N!} = \frac{Z_0^N}{N!}
\end{aligned} \tag{31-27}$$

with the use of (31-24) and (31-25). Thus, in the high-temperature limit,
the canonical partition function for a system of independent
indistinguishable subsystems can be approximately expressed in terms of the
partition function $Z_0$ in μ-space. For an ideal gas, for example, we find from
(31-27) and (26-29) that

$$Z_{(iB)} = \frac{V^N}{N!} \left( \frac{2\pi m k T}{h^2} \right)^{3N/2} \tag{31-28}$$


which one could also verify by direct integration of (31-7) throughout all
Γ-space.

If we use (31-27) in (31-8) and (31-11), we obtain

$$U = -N \frac{\partial \ln Z_0}{\partial \beta}, \qquad S = \frac{U}{T} + k \ln \frac{Z_0^N}{N!}$$

which agree exactly with the formulas (21-36) and (21-53d) previously
obtained by dealing entirely with μ-space and corrected Boltzmann
counting, again showing how the necessity of using (21-5) rather than
(21-4) is due to the basic indistinguishability of atoms and molecules.
To show the corresponding equivalence of our μ-space calculations for the
Bose-Einstein and Fermi-Dirac cases with the general canonical ensemble
requires much more complicated mathematical methods than we have
been using, and hence we shall satisfy ourselves with the assertion that it
can be done and that our results of Chapter 27 can be similarly rederived.
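The quality of the approximation (31-27) can be examined numerically for a small dilute system. In the sketch below the exact sum (31-26) is carried out by enumerating every set of occupation numbers with $\sum_i n_i = N$, allowing multiple occupancy of a cell (an assumption corresponding to Bose-Einstein counting). The 400 closely spaced levels, N = 2, and the function names are hypothetical choices made so that few cells are doubly occupied.

```python
import itertools
import math

def z_indistinguishable(levels, n_particles, beta):
    """Exact sum (31-26): one term per set of occupation numbers with sum n_i = N."""
    total = 0.0
    # Each multiset of N levels corresponds to exactly one distribution n_i.
    for combo in itertools.combinations_with_replacement(levels, n_particles):
        total += math.exp(-beta * sum(combo))
    return total

def z_boltzmann_limit(levels, n_particles, beta):
    """Approximation (31-27): Z_0^N / N!."""
    z0 = sum(math.exp(-beta * e) for e in levels)
    return z0 ** n_particles / math.factorial(n_particles)

levels = [0.005 * i for i in range(400)]  # hypothetical, closely spaced levels
exact = z_indistinguishable(levels, 2, beta=1.0)
approx = z_boltzmann_limit(levels, 2, beta=1.0)
ratio = exact / approx
print(ratio)
```

The ratio exceeds unity only by the small contribution of the doubly occupied cells, and it approaches unity as the number of accessible cells per particle increases.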


31-4 Theorem of van Leeuwen 


As a specific example of the application of the canonical ensemble, we 
consider a result of classical statistical mechanics which illustrates the 
generality of this approach and also is quite instructive in the way in 
which it requires that we combine many separate results of Hamiltonian 
mechanics, electrodynamics, and statistical mechanics. The theorem 
asserts that, if we treat matter in a completely classical way, the component 
of its magnetic moment in the direction of an applied magnetic field is
always zero. In other words, if we assume that matter consists solely of
charged particles which can be completely described by classical mechanics,
electrodynamics, and statistical mechanics, a material can never be
magnetized.

If we consider our material to consist of N charges $q_1, q_2, \ldots, q_N$ with
position vectors $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N$, the total magnetic dipole moment
$\boldsymbol{\mathcal{M}}$ of the system is given by

$$\boldsymbol{\mathcal{M}} = \sum_{j=1}^{N} \frac{q_j}{2m_j}\, \mathbf{l}_j = \sum_{j=1}^{N} \tfrac{1}{2} q_j (\mathbf{r}_j \times \mathbf{u}_j) \tag{31-29}$$

according to (I: 38-18 and 38-20), where $\mathbf{l}_j = \mathbf{r}_j \times m_j \mathbf{u}_j$ is the angular
momentum of the jth charge of velocity $\mathbf{u}_j$.

Let us now suppose that the material is subject to homogeneous electric
and magnetic fields $\mathbf{E}$ and $\mathbf{B}$ which, for simplicity, we assume are along
the positive z axis. These fields can be derived from the potentials

$$\phi = -Ez, \qquad \mathbf{A} = \tfrac{1}{2}\, \mathbf{B} \times \mathbf{r} \tag{31-30}$$

according to (I: 21-8 and 38-17). We also know from (I: 38-15) that the
Hamiltonian of this system in the presence of electromagnetic fields is

$$H = \sum_j \left[ \frac{1}{2m_j} (\mathbf{p}_j - q_j \mathbf{A}_j)^2 + q_j \phi_j \right] \tag{31-31}$$

where the potentials are evaluated at the positions of the charges, so that
from (31-30) we find

$$\phi_j = \phi_{j0} - E z_j, \qquad A_{jx} = -\tfrac{1}{2} B y_j, \qquad A_{jy} = \tfrac{1}{2} B x_j, \qquad A_{jz} = 0 \tag{31-32}$$

where $\phi_{j0}$ is determined solely by the mutual electrostatic forces between
the individual charges. We are neglecting the effect of the magnetic field
produced by one moving charge on the motion of another charge. This
effect is really very small compared to the electrostatic forces; furthermore,
it would make the vector potential depend on the charge velocities as well
as on position, and then we could not describe this problem within the
framework of Hamiltonian mechanics.

Substituting (31-32) into (31-31), we obtain

$$H = \sum_j \left\{ \frac{1}{2m_j} \left[ (p_{jx} + \tfrac{1}{2} B q_j y_j)^2 + (p_{jy} - \tfrac{1}{2} B q_j x_j)^2 + p_{jz}^2 \right] + q_j (\phi_{j0} - E z_j) \right\} \tag{31-33}$$

We also know from (I: 38-14) that

$$\mathbf{p}_j - q_j \mathbf{A}_j = m_j \mathbf{u}_j \tag{31-34}$$

If we now differentiate (31-33) with respect to B and use (31-34) and
(31-29), we find that

$$\begin{aligned}
-\frac{\partial H}{\partial B} &= \sum_j \frac{q_j}{2m_j} \left[ -(p_{jx} + \tfrac{1}{2} B q_j y_j)\, y_j + (p_{jy} - \tfrac{1}{2} B q_j x_j)\, x_j \right] \\
&= \sum_j \tfrac{1}{2} q_j (x_j u_{jy} - y_j u_{jx}) = \mathcal{M}_z = \mathcal{M}
\end{aligned} \tag{31-35}$$

where $\mathcal{M}$ is the component of the total moment in the direction of the
field for a given state of the system.

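For a system in thermal equilibrium at constant temperature, the
pertinent quantity is the canonical ensemble average. We find from
(31-35), (31-3), and (31-7) that

$$\bar{\mathcal{M}} = \frac{\int \mathcal{M}\, e^{-\beta H}\, dp_{1x} \cdots dz_N}{\int e^{-\beta H}\, dp_{1x} \cdots dz_N} = \frac{1}{\beta} \frac{\partial \ln Z}{\partial B} \tag{31-36}$$

which is similar to the result (24-4) and where

$$Z = \frac{1}{h^{\mathcal{N}}} \int e^{-\beta H}\, dp_{1x} \cdots dz_N \tag{31-37}$$

The theorem can now be easily proved. In the absence of a magnetic
field, the Hamiltonian is $H_0 = H_0(\mathbf{p}_1, \ldots, \mathbf{r}_N)$. In the presence of a field,
the Hamiltonian is given by

$$H = H_0(p_{1x} - q_1 A_{1x}, \ldots, p_{Nz} - q_N A_{Nz};\ \mathbf{r}_1, \ldots, \mathbf{r}_N) \tag{31-38}$$

according to (31-31); the same functional form $H_0$ is used in both cases.
The partition function in the field is then

$$Z(B, T) = \frac{1}{h^{\mathcal{N}}} \int e^{-\beta H_0(p_{1x} - q_1 A_{1x},\, \ldots,\, \mathbf{r}_N)}\, dp_{1x} \cdots dz_N \tag{31-39}$$

If we define new variables $\mathbf{p}_j'$ by

$$\mathbf{p}_j' = \mathbf{p}_j - q_j \mathbf{A}_j, \qquad dp_{jx}' = dp_{jx}, \ \text{etc.} \tag{31-40}$$

then (31-39) becomes

$$Z(B, T) = \frac{1}{h^{\mathcal{N}}} \int e^{-\beta H_0(\mathbf{p}_1',\, \ldots,\, \mathbf{r}_N)}\, dp_{1x}' \cdots dz_N = Z(0, T) \tag{31-41}$$

where Z(0, T) is the partition function in zero field. This result follows
because the limits of integration for the momentum components extend
from $-\infty$ to $\infty$ in both (31-39) and (31-41). Therefore the partition
function is independent of the magnetic field, and (31-36) becomes

$$\bar{\mathcal{M}} = \frac{1}{\beta} \frac{\partial \ln Z(B, T)}{\partial B} = \frac{1}{\beta} \frac{\partial \ln Z(0, T)}{\partial B} = 0 \tag{31-42}$$

which proves the theorem.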
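The essential step of the proof, the change of variables (31-40), can be illustrated numerically for a single momentum component. The sketch below is only an illustration under stated assumptions: it uses units with m = 1 and $\beta = 1$, a finite integration range wide enough that the neglected tails are negligible, and hypothetical values of the shift $qA$; the function name is likewise illustrative.

```python
import math

def momentum_integral(shift, beta=1.0, mass=1.0, p_max=40.0, steps=80001):
    """Trapezoidal estimate of the integral of exp(-beta*(p - shift)^2 / 2m) over p."""
    dp = 2.0 * p_max / (steps - 1)
    total = 0.0
    for i in range(steps):
        p = -p_max + i * dp
        weight = 0.5 if i in (0, steps - 1) else 1.0  # end points count half
        total += weight * math.exp(-beta * (p - shift) ** 2 / (2.0 * mass))
    return total * dp

exact = math.sqrt(2.0 * math.pi)  # sqrt(2*pi*m/beta) for m = beta = 1
values = [momentum_integral(q_a) for q_a in (0.0, 0.5, 3.0)]  # hypothetical shifts qA
print(exact, values)
```

Each value reproduces the unshifted Gaussian integral, so the momentum integration removes every trace of the vector potential, and hence of B, from Z.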

This theorem dates back to about 1905. It was extremely disturbing at 
that time because it indicated a basic disagreement between experiment 
and the then current formulation of theoretical physics. The situation 
was finally resolved, however, with the introduction of the concept of 
electron spin and its associated magnetic moment. These non-classical 
variables cannot be included in the proof above, of course, so the theorem 
does not apply. 


Exercises 


31-1. Another method of deriving (31-3) is based on the assumption that two
coupled systems in equilibrium will remain in equilibrium in the limiting case
of vanishingly small coupling. Show that, if the Hamiltonians of the two systems
are $H_1$ and $H_2$ and the coupling term is $\delta H$, the probability of the combined
system is

$$p(H) = p(H_1 + H_2 + \delta H) = p_1(H_1)\, p_2(H_2)$$

Consider the limit $\delta H \to 0$, and then use the methods of Chapter 18 and Sec. 21-4
to show that $p(H) = \text{const.}\; e^{-\beta H}$, where $\beta$ is some constant.

31-2. Using the notation of Exercise 14-3, show that the formulas for the
electric case which are analogous to (31-35) and (31-36) are

$$\mathcal{P} = -\partial H / \partial E \quad \text{and} \quad \bar{\mathcal{P}} = kT\, (\partial \ln Z / \partial E)$$

31-3. Show that the entropy as given by the canonical ensemble distribution
can also be written

$$S = -k\, \overline{\ln p_n} = -k \sum_n p_n \ln p_n$$


Compare with Exercise 21-3. 


32 Fluctuations and noise


The energy U as given by (31-8) was called the average energy because a 
system at constant temperature can exchange energy with its surroundings. 
One naturally wonders therefore how we are able to identify the average 
energy with the thermodynamic energy U since the very concept of an 
average value implies the possibility that any single measurement of the
energy will yield a value which is larger or smaller than the average. The 
good agreement which is found between the statistical average values and 
the macroscopic measurements indicates that the fluctuations or deviations 
from the average are generally quite small. Although we briefly touched 
on this matter with (21-45), we now want to discuss these fluctuations in a 
more systematic manner. 


32-1 Fluctuations in the energy 


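We can define the fluctuation $\Delta E$ in the energy as the difference between
a given value E and the average so that

$$\Delta E = E - \bar{E} = E - U \tag{32-1}$$

because of (31-8). The average fluctuation is zero, of course, as we can
quickly find from (32-1)

$$\overline{\Delta E} = \overline{E - \bar{E}} = \bar{E} - \bar{E} = 0 \tag{32-2}$$

The mean square fluctuation is of more use and is defined as the average of
the square of the fluctuation from the average; thus

$$\overline{(\Delta E)^2} = \overline{(E - \bar{E})^2} = \overline{E^2} - 2\bar{E}\bar{E} + (\bar{E})^2 = \overline{E^2} - (\bar{E})^2 \tag{32-3}$$

We see from (31-3), (31-6), and (31-8) that

$$\begin{aligned}
\overline{E^2} &= \frac{1}{Z} \sum_n E_n^2\, e^{-\beta E_n} = \frac{1}{Z} \frac{\partial^2 Z}{\partial \beta^2} \\
&= \frac{\partial}{\partial \beta} \left( \frac{\partial \ln Z}{\partial \beta} \right) + \left( \frac{\partial \ln Z}{\partial \beta} \right)^2 \\
&= \frac{\partial^2 \ln Z}{\partial \beta^2} + (\bar{E})^2
\end{aligned} \tag{32-4}$$

and therefore

$$\overline{(\Delta E)^2} = \frac{\partial^2 \ln Z}{\partial \beta^2} = -\frac{\partial U}{\partial \beta} = kT^2 \frac{\partial U}{\partial T} = kT^2 C_v \tag{32-5}$$

since $\beta = 1/kT$ and $C_v$ is the heat capacity of the system. We see from
(32-5) that the mean square energy fluctuation is determined by the
thermodynamic parameters of the system. The relative magnitude of
these quantities can be estimated if we consider a reasonable example.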
Example. Monatomic Ideal Gas. According to (22-12), $C_v = \tfrac{3}{2} Nk$ so
that (32-5) yields

$$\overline{(\Delta E)^2} = \tfrac{3}{2} N k^2 T^2 \tag{32-6}$$

Although this quantity can be quite large in absolute value because it is
proportional to N, what is of more significance is the fractional
fluctuation which compares (32-6) with the system energy $U = \tfrac{3}{2} NkT$; we find

$$\frac{\overline{(\Delta E)^2}}{(\bar{E})^2} = \frac{\overline{(\Delta E)^2}}{U^2} = \frac{2}{3N} \tag{32-7}$$

and therefore

$$\frac{\sqrt{\overline{(\Delta E)^2}}}{U} = \sqrt{\frac{2}{3N}} \approx 10^{-13} \tag{32-8}$$

These results show that the average fluctuation is so small as to be
completely unobservable because of the large number of particles
involved. Hence we are well justified in identifying the average energy
with the thermodynamic equilibrium energy; even though there is a
finite probability of finding our system to have any value of the energy,
this chance will be so small that in actual practice the energy will have a
well-defined value.
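The relation (32-5) between the variance of the energy and the derivative of U can be checked directly for any small level scheme. In the sketch below the four levels, the value of $\beta$, and the function names are hypothetical; the variance is computed from the definition (32-3) and compared with a central-difference estimate of $-\partial U / \partial \beta$. A second function evaluates the fractional fluctuation $\sqrt{2/3N}$ of the ideal gas example.

```python
import math

def mean_energy(levels, beta):
    """U of (31-8) for a discrete set of levels."""
    weights = [math.exp(-beta * e) for e in levels]
    z = sum(weights)
    return sum(e * w for e, w in zip(levels, weights)) / z

def energy_variance(levels, beta):
    """Mean square fluctuation from the definition (32-3)."""
    weights = [math.exp(-beta * e) for e in levels]
    z = sum(weights)
    mean = sum(e * w for e, w in zip(levels, weights)) / z
    mean_sq = sum(e * e * w for e, w in zip(levels, weights)) / z
    return mean_sq - mean * mean

def fractional_fluctuation(n_particles):
    """Root mean square fractional energy fluctuation of a monatomic ideal gas."""
    return math.sqrt(2.0 / (3.0 * n_particles))

levels = [0.0, 0.4, 1.1, 1.9]  # hypothetical levels, in units with k = 1
beta, h = 1.3, 1e-5
variance = energy_variance(levels, beta)
minus_dU_dbeta = -(mean_energy(levels, beta + h) - mean_energy(levels, beta - h)) / (2 * h)
print(variance, minus_dU_dbeta, fractional_fluctuation(1e24))
```

The two estimates of the mean square fluctuation agree to within the accuracy of the finite difference, and the fractional fluctuation falls off as $N^{-1/2}$.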


We also see from (32-5) that it is a consequence of the third law that the
energy fluctuations will vanish at absolute zero; that is

$$\overline{(\Delta E)^2} \to 0 \quad \text{as } T \to 0 \tag{32-9}$$

because the heat capacity remains finite as $T \to 0$ according to (13-20).
On the other hand, the relative fluctuation in the energy will generally not
vanish, for we see from (32-7) that the ratio $|\Delta E|/U$ is independent of the
temperature.


32-2 Fluctuations in other quantities 


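Let us now consider fluctuations in the generalized forces $F_\lambda$ which,
according to (31-9), (31-10), and (31-12), are given by

$$\bar{F}_\lambda = \overline{\frac{\partial E_n}{\partial a_\lambda}} = -\frac{1}{\beta} \frac{\partial \ln Z}{\partial a_\lambda} = \frac{\partial F}{\partial a_\lambda} \tag{32-10}$$

A convenient starting point is (31-14), which can be written

$$\sum_n e^{\beta (F - E_n)} = 1 \tag{32-11}$$

If we differentiate this expression with respect to $a_\lambda$, we obtain

$$\overline{\left( \frac{\partial F}{\partial a_\lambda} - \frac{\partial E_n}{\partial a_\lambda} \right)} = \frac{\partial F}{\partial a_\lambda} - \overline{\frac{\partial E_n}{\partial a_\lambda}} = \bar{F}_\lambda - \bar{F}_\lambda = 0 \tag{32-12}$$

as we already know. Now, however, let us differentiate (32-12) with
respect to the parameter $a_\mu$; the result is

$$\begin{aligned}
0 &= \overline{\left( \frac{\partial^2 F}{\partial a_\lambda\, \partial a_\mu} - \frac{\partial^2 E_n}{\partial a_\lambda\, \partial a_\mu} \right)} + \beta\, \overline{\left( \frac{\partial F}{\partial a_\lambda} - \frac{\partial E_n}{\partial a_\lambda} \right) \left( \frac{\partial F}{\partial a_\mu} - \frac{\partial E_n}{\partial a_\mu} \right)} \\
&= \frac{\partial^2 F}{\partial a_\lambda\, \partial a_\mu} - \overline{\frac{\partial^2 E_n}{\partial a_\lambda\, \partial a_\mu}} + \frac{1}{kT}\, \overline{(F_\lambda - \bar{F}_\lambda)(F_\mu - \bar{F}_\mu)}
\end{aligned} \tag{32-13}$$

because of (31-3), (31-12), and (31-14). Therefore we find from (32-13)
that

$$\overline{\Delta F_\lambda\, \Delta F_\mu} = \overline{(F_\lambda - \bar{F}_\lambda)(F_\mu - \bar{F}_\mu)} = kT \left( \overline{\frac{\partial^2 E_n}{\partial a_\lambda\, \partial a_\mu}} - \frac{\partial^2 F}{\partial a_\lambda\, \partial a_\mu} \right) \tag{32-14}$$

which is a generalized fluctuation theorem involving possibly different
generalized forces.

If $\lambda = \mu$ and if we also use (32-10), we find that (32-14) becomes

$$\overline{(\Delta F_\lambda)^2} = \overline{(F_\lambda - \bar{F}_\lambda)^2} = kT \left( \overline{\frac{\partial^2 E_n}{\partial a_\lambda^2}} - \frac{\partial \bar{F}_\lambda}{\partial a_\lambda} \right) \tag{32-15}$$

If we apply this to the pressure so that $a_\lambda = V$ and $F_\lambda = -p$, we find from
(32-15) that

$$\overline{(\Delta p)^2} = kT \left( \overline{\frac{\partial^2 E_n}{\partial V^2}} + \frac{\partial p}{\partial V} \right) \tag{32-16}$$

It is sometimes more convenient to express (32-16) in terms of the
isothermal compressibility $\kappa_T$ defined in (8-2); we then find that

$$\overline{(\Delta p)^2} = kT \left( \overline{\frac{\partial^2 E_n}{\partial V^2}} - \frac{1}{V \kappa_T} \right) \tag{32-17}$$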
Example. Monatomic Ideal Gas. If, for simplicity, we consider the gas
to be contained in a cube of sides $L_x = L_y = L_z = L$ and of volume
$V = L^3$, we find from (26-2) that the energy levels of a particle are given
by

$$\epsilon_{n_x n_y n_z} = \frac{\alpha}{L^2} = \frac{\alpha}{V^{2/3}} \tag{32-18}$$

where $\alpha$ is a quantity independent of the volume. The energy levels of
the whole system will be a linear combination of terms like (32-18),
according to (31-20), and, since each term in the sum depends on the
volume in the same way, we can conclude that

$$E_n = \frac{A}{V^{2/3}} \tag{32-19}$$

where A is independent of the volume. Therefore

$$\frac{\partial E_n}{\partial V} = -\frac{2A}{3V^{5/3}} = -\frac{2E_n}{3V} \tag{32-20}$$

and hence the average pressure as obtained from (32-10) is given by

$$p = -\overline{\frac{\partial E_n}{\partial V}} = \frac{2\bar{E}}{3V} = \frac{2U}{3V} \tag{32-21}$$

and is the same as (28-15), which we obtained by other means.
We also find from (32-20) that

$$\frac{\partial^2 E_n}{\partial V^2} = \frac{10A}{9V^{8/3}} = \frac{10E_n}{9V^2}$$

so that

$$\overline{\frac{\partial^2 E_n}{\partial V^2}} = \frac{10\bar{E}}{9V^2} = \frac{5p}{3V} \tag{32-22}$$

Since $p = NkT/V$, we find that

$$\frac{\partial p}{\partial V} = -\frac{NkT}{V^2} = -\frac{p}{V} \tag{32-23}$$

and, when the last two results are substituted into (32-16), we obtain

$$\overline{(\Delta p)^2} = \frac{2pkT}{3V} = \frac{2p^2}{3N} \tag{32-24}$$

and the fractional fluctuation in the pressure is therefore

$$\frac{\overline{(\Delta p)^2}}{p^2} = \frac{2}{3N} \tag{32-25}$$

which is exactly what was found in (32-7) for the energy fluctuation.
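For completeness, the substitution that leads to the pressure fluctuation of the ideal gas can be written out in one line. It uses only the general result (32-16), the two derivatives found in this example, and the ideal gas relation $kT/V = p/N$:

```latex
\overline{(\Delta p)^2}
  = kT\left(\frac{5p}{3V} - \frac{p}{V}\right)
  = \frac{2pkT}{3V}
  = \frac{2p^2}{3N},
\qquad \text{since } \frac{kT}{V} = \frac{p}{N}.
```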
Since p = nkT by (17-12) and (17-15), we can obtain the density
fluctuation for our isothermal system directly from (32-25):

$$\frac{\overline{(\Delta n)^2}}{n^2} = \frac{2}{3N} \tag{32-26}$$

It is evident that considerations similar to these could be applied to
other thermodynamic variables. In addition, higher order fluctuations
could be found by extending the type of calculation which led to (32-14)
by including further differentiation with respect to the parameters $a_\lambda$.


32-3 "Thermal noise"


The title of this section describes another type of phenomenon which is
often referred to as a fluctuation, although it does not apply to the system
as a whole. This effect is generally calculated by applying the equipartition
theorem to separate degrees of freedom. We shall discuss only a few
examples rather briefly.

The earliest example is called Brownian motion; it was first observed by
the botanist Brown, who noted the constant motion of pollen particles
which were suspended in water. Since these small particles are in thermal
equilibrium with their surroundings, we can use the equipartition theorem
result (25-22)

$$\tfrac{1}{2} m \overline{u^2} = \tfrac{3}{2} kT$$

which leads to an rms speed

$$u_{rms} = \sqrt{\frac{3kT}{m}} \tag{32-27}$$

For a mass of about $10^{-18}$ kilograms, $u_{rms}$ as found from (32-27) is about
0.10 meter/second, which is about what is observed. The magnitude of
$u_{rms}$ is virtually unobservable for more massive bodies; for example, if
m = 1 kilogram, $u_{rms} \approx 3 \times 10^{-3}$ meter/year.
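The two numerical estimates quoted above follow from (32-27) once a temperature is chosen. The sketch below assumes T = 300 K, which is not stated in the text, and uses a hypothetical function name.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def rms_speed(mass_kg, temperature_k=300.0):
    """u_rms of (32-27); the default temperature of 300 K is an assumption."""
    return math.sqrt(3.0 * K_B * temperature_k / mass_kg)

u_small = rms_speed(1e-18)                         # meter/second
u_large_per_year = rms_speed(1.0) * SECONDS_PER_YEAR  # meter/year
print(u_small, u_large_per_year)
```

The small particle moves at roughly a tenth of a meter per second, while the one-kilogram body would drift only a few millimeters in a year.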

This thermal motion may be magnified, however, in some measuring
equipment, and the useful sensitivity of the apparatus thereby limited.
A simple example of such a situation is provided by a galvanometer
mirror. If $\varphi$ is the angular displacement, the elastic restoring torque in the
suspension is $-c\varphi$, where c is a constant of proportionality. The potential
energy associated with this angular displacement is $\tfrac{1}{2} c \varphi^2$ by (I: 5-3);
hence

$$\tfrac{1}{2} c\, \overline{\varphi^2} = \tfrac{1}{2} kT \tag{32-28}$$

by (25-22). Therefore there will be a random deflection superimposed on
the steady deflection of the mirror. The rms value as obtained from
(32-28) is

$$\varphi_{rms} = \sqrt{\frac{kT}{c}} \tag{32-29}$$

Accordingly, deflections less than $\varphi_{rms}$ will be hidden by the random
thermal motion and cannot be detected by the apparatus. One cannot
avoid this problem by simply increasing the sensitivity, because we see
that, as c is decreased, $\varphi_{rms}$ is increased.

The derivation of the equipartition theorem in Sec. 25-2 shows that its 
application is not limited to mechanical degrees of freedom because it 
depended only on very general properties of the Hamiltonian description 
of the system; for example, we have already applied it to electromagnetic 
normal modes. As another example, a short-circuited inductance L
possesses a random thermal electric current I which is found from (25-22)
and (I: Exercise 26-2) to be given by

$$\tfrac{1}{2} L\, \overline{I^2} = \tfrac{1}{2} kT$$

so that

$$I_{rms} = \sqrt{\frac{kT}{L}} \tag{32-30}$$

Similarly, a short-circuited condenser whose energy can be written in
terms of its capacitance C and potential difference V as $\tfrac{1}{2} C V^2$ by (I:
Exercise 23-2) has a random potential difference given by

$$V_{rms} = \sqrt{\frac{kT}{C}} \tag{32-31}$$


These random currents and potential differences are called thermal or 
Johnson noise. This noise often limits the amplification which can be 
applied to a given signal and is consequently quite important in commu- 
nication or detecting equipment. It is important to know the general 
properties of such noise; one feature of interest is the frequency distribu- 
tion. For the simple case of a resonant circuit consisting of a condenser 
shorted by an inductance, the rms amplitudes given by (32-30) and (32-31) 
will apply to the resonant frequency. In the next section, we obtain the 
noise frequency distribution for a less restricted situation. 
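The results (32-29), (32-30), and (32-31) all have the same form, since each follows from a quadratic energy term with average value $\tfrac{1}{2} kT$. The sketch below evaluates them with a single function; the component values, the torsion constant, the temperature of 300 K, and the function name are all hypothetical choices made for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def rms_from_equipartition(stiffness, temperature_k=300.0):
    """sqrt(kT/s): rms of a coordinate x whose energy is (1/2)*s*x^2."""
    return math.sqrt(K_B * temperature_k / stiffness)

i_rms = rms_from_equipartition(1e-3)     # assumed L = 1 mH, result in amperes
v_rms = rms_from_equipartition(1e-12)    # assumed C = 1 pF, result in volts
phi_rms = rms_from_equipartition(1e-13)  # assumed c = 1e-13 N m/rad, result in radians
print(i_rms, v_rms, phi_rms)
```

With these assumed values the random current is of the order of nanoamperes and the random potential difference is some tens of microvolts, which indicates why such noise matters in sensitive detecting equipment.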


32-4 Nyquist’s theorem 


Let us consider a very long lossless transmission line of length l which
has the same kind of termination at each end. A mechanical example
would be a very long continuous string fastened at the ends. An acoustic
example would be a long pipe through which sound could be transmitted.
Electromagnetic examples would be a hollow wave guide or a coaxial
line. In any event, we know from previous experience that the effect of the
boundary conditions for this essentially one-dimensional problem will be
to determine the many possible normal modes of oscillation of the system.
Typically, the normal mode can be expressed as a standing wave with a
form such as

$$a \sin \frac{2\pi x}{\lambda_n} \sin 2\pi \nu_n t = a \sin \frac{n\pi x}{l} \sin 2\pi \nu_n t$$

where a is the amplitude and n is an integer. If v is the speed of the wave,
the frequency of the nth mode is related to the wavelength by (I: 15-7)

$$\nu_n = \frac{v}{\lambda_n} = \frac{nv}{2l} \tag{32-32}$$

and hence the number of modes, $\Delta n$, of this type in the frequency interval
$\Delta\nu$ is found from (32-32) to be

$$\Delta n = \frac{2l}{v}\, \Delta\nu \tag{32-33}$$

The average energy of such a mode is given by the expression (26-36) for
the average oscillator energy $\bar{\epsilon}_{osc}$, provided that we drop the zero point
energy as usual. Therefore the total energy in the frequency range $\Delta\nu$ is

$$U_\nu\, \Delta\nu = \bar{\epsilon}_{osc}\, \Delta n = \frac{2l\, \bar{\epsilon}_{osc}}{v}\, \Delta\nu$$

so that the average energy density per unit length is

$$u_\nu\, \Delta\nu = \frac{2\bar{\epsilon}_{osc}}{v}\, \Delta\nu \tag{32-34}$$

Each standing wave can be written as two traveling waves, one in each
direction, as shown, for example, in (I: 15-4). The energy flux in each
traveling wave equals the average energy density times the speed of the
wave by (I: 28-54). Therefore the power in a given frequency interval in a
given direction $P_\nu\, \Delta\nu$ can be obtained from (32-34) as

$$P_\nu\, \Delta\nu = \tfrac{1}{2} v u_\nu\, \Delta\nu = \bar{\epsilon}_{osc}\, \Delta\nu = \frac{h\nu\, \Delta\nu}{e^{h\nu/kT} - 1} \tag{32-35}$$

which is the basic formula for the frequency distribution of Johnson
noise. Nyquist's theorem refers to the form of (32-35) which corresponds
to the Rayleigh-Jeans approximation to Planck's law; that is, $\bar{\epsilon}_{osc} \approx kT$
or $h\nu/kT \ll 1$, so that (32-35) becomes

$$P_\nu\, \Delta\nu = kT\, \Delta\nu \tag{32-36}$$

and gives a noise distribution whose density is independent of the frequency
and agrees well with experiment in those situations where it can be
expected to be applicable.
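The range of validity of (32-36) can be seen by comparing it numerically with the full expression (32-35). In the sketch below the frequencies, the bandwidth, the temperature of 300 K, and the function names are hypothetical choices made only for illustration.

```python
import math

H_PLANCK = 6.62607015e-34  # Planck constant in J s
K_B = 1.380649e-23         # Boltzmann constant in J/K

def noise_power_planck(freq_hz, bandwidth_hz, temperature_k):
    """Noise power in one direction from the full expression (32-35)."""
    x = H_PLANCK * freq_hz / (K_B * temperature_k)
    return H_PLANCK * freq_hz * bandwidth_hz / math.expm1(x)

def noise_power_nyquist(bandwidth_hz, temperature_k):
    """Noise power in one direction from Nyquist's form (32-36)."""
    return K_B * temperature_k * bandwidth_hz

# Assumed values: 1 MHz center frequency, 10 kHz bandwidth, 300 K
p_exact = noise_power_planck(1e6, 1e4, 300.0)
p_nyq = noise_power_nyquist(1e4, 300.0)
p_high = noise_power_planck(1e14, 1e4, 300.0)  # a frequency where h*nu >> kT
print(p_exact, p_nyq, p_high)
```

At radio frequencies the two expressions are practically indistinguishable, whereas at frequencies for which $h\nu$ is much greater than kT the noise power falls far below the value given by Nyquist's theorem.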


Exercises 


32-1. Calculate the mean square fluctuational deviation from the vertical of a 
suspended simple pendulum. 

32-2. Find the fractional fluctuation in energy of a Debye crystal at low 
temperatures. 

32-3. A good human ear can detect sound if the power is as low as 107!” watt 
for frequencies between $10^3$ and $3 \times 10^3$ (second)$^{-1}$. If one listened at the end
of a long tube, would the thermal noise be detectable? Repeat this estimate 
for the familiar example of hearing the roar of the sea by holding a large seashell 
to the ear. 

32-4. In what sense can thermal radiation be described as electromagnetic 
noise ? 

32-5. Show that the fluctuations in $n_i$ for Bose-Einstein and Fermi-Dirac
systems are

$$\overline{(\Delta n_i)^2} = \bar{n}_i (1 \pm \bar{n}_i)$$

What does this become in the Boltzmann limit?

32-6. Show that $\overline{(\Delta E)^3} = -\partial^3 \ln Z / \partial \beta^3$, and evaluate it for a monatomic
ideal gas.
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crystal, 224 

dipole in field, 209 
distribution of, 158 

electrons in metal, 267, 273 
Fermi, 263, 278, 283 
fluctuations, 296, 303 
forbidden, 274 

free, 101 

free particle, 230 

ideal gas, 77, 83, 84, 99, 258 
kinetic, 41 

Legendre transformations of, 101 
magnetic, 125 

monatomic gas, 222 

normal mode, 223 

oscillator, 220, 230, 237 

rest, 41 

rigid rotator, 210, 222, 24C€ 
thermal radiation, 245 
thermodynamic, 74 
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Energy, total, 41 
transport, 158, 166 
van der Waals gas, 112 
zero point, 233 
Energy density, thermal radiation, 226, 
228, 244 
Energy level, 232 
Ensemble, 173 
and I-space, 173, 180 
and heat reservoir, 286 
and u-space, 180 
canonical, 285 
microcanonical, 178, 181; 285 
Ensemble average, 174, 177, 183 
Enthalpy, 77, 84, 101, 114, 120 
and equilibrium, 106 
of ideal gas, 78 
Entropy, 86, 93, 192 
absolute value, 119, 136, 197 
and equation of state, 99 
and probability, 189 
and vapor pressure, 197 
as exact differential, 94 
Bose-Einstein, 253 
canonical ensemble, 288, 295 
composite system, 189 
Curie law material, 129 
distinguishable subsystems, 193 
electrons in metal, 268 
Fermi-Dirac, 253 
ideal gas, 85, 196 
identical subsystems, 190 
maximum property, 97 
oscillator, 237 
subsystem with two states, 235 
system of oscillators, 229 
thermal radiation, 245 
van der Waals gas, 114 
Entropy change of heat reservoir, 104 
Equation of continuity, 45, 175 
Equation of motion, non-relativistic, 
38, 175 
of charge, 55 
relativistic, 39 
Equation of state, 67, 102 
and energy, 97, 100 
and entropy, 99 
and heat capacities, 98 
and Joule-Thomson coefficient, 116 
and partition function, 192
Equation of state, and virial coefficients, 200
at absolute zero, 124 
Dieterici, 118 
ideal gas, 73, 151, 196, 258 
Langevin, 212 
magnetic, 126, 127 
thermal radiation, 246 
van der Waals, 109 
Weiss, 213 
Equilibrium, 65, 97 
and macrostate, 183 
and phase transitions, 131 
and potentials, 103 
conditions for, 103 
Equipartition theorem, 218, 222 
and noise, 300 
relativistic, 229 
Equivalence of reference systems, 12, 
17 
Ergodic theory, 174 
Ether, 12, 14, 15 
Event, 7, 21 
Exact differential, 63, 66, 126 
and energy, 74 
and entropy, 93 
Exclusion principle, 247, 261, 263 
Extensive variables, 77, 85, 100, 123, 
126 


Fahrenheit temperature scale, 66 
Faraday disk, 16 
Faraday’s law of induction, 10 
Fermi-Dirac statistics, 247 
degeneracy, 262 
distribution, 252 
entropy, 253 
equation of state, 258 
fluctuations, 303 
probability, 250, 263 
Fermi energy, 263, 267, 273 
at absolute zero, 263 
impurity semiconductor, 283 
intrinsic semiconductor, 278 
Fermi level, 263 
Ferromagnetism, and third law, 218, 
246 
heat capacities, 216 
Weiss theory, 212 
First law of thermodynamics, 74
First order phase transitions, 136
energy change, 138
Fizeau, 24 
Fluctuations, 296 ff. 
Fluid, and Carnot cycle, 87 
as thermodynamic system, 66 
element of work, 69 
Forbidden bands, 274 
Forbidden energy, 274 
Force, as 4-vector, 39 
contribution to virial, 204
generalized, 126, 191, 288, 297
Minkowski, 39
transformation formula, 44 
4-acceleration, 38 
4-current, 47 
4-force, 39 
4-momentum, 39, 41 
4-potential, 47 
4-vector, 33 
acceleration, 38 
and covariance, 37 
current, 47 
divergence of a tensor, 36 
force, 39 
gradient, 36 
length of, 34 
momentum, 39 
potential, 47 
scalar product of, 33 
transformation properties, 33 
4-velocity, 34 
Fowler, 183, 291 
Free energy, 101 
and equilibrium, 105 
Free expansion, 82 
van der Waals gas, 118 
Free particle, 221 
energy of, 230 
Free path, 159
Freezing point and pressure, 133
Frequency of collision, 160
Fresnel coefficient, 24
Friction, and heat, 71
and viscosity, 161
Fusion, 133


Galilean relativity, 8, 12, 15
Galilean transformation, 7, 11, 17, 20
Galvanometer, 300
Gamma function, 154 
Γ-space, 172
and ensembles, 173 
density, 174 
oscillator, 173 
probability, 176, 286 
“super,” 286 
Gap width, 280, 281 
Gas, constant speed, 147 
degenerate, 254 
diatomic, 210, 222 
mixtures, 198 
monatomic, 152, 166, 195 
non-degenerate, 254 
two-dimensional, 152, 167 
Gas constant, 73 
per molecule, 151 
Gas temperature scale, 72 
Gaussian error curve, 157 
Generalized displacement, 126 
Generalized force, 126, 191 
canonical ensemble, 288 
fluctuations in, 297 
General Lorentz transformation, 29
General relativity, 17
Germanium, 281
Gibbs, 173, 247
Gibbs free energy, 101
Gibbs function, 101, 120, 263
and equilibrium, 105
and phase transition, 131, 136
ideal gas, 257
thermal radiation, 246
vapor, 135
Glacier, 133
Gradient as 4-vector, 36


Hamiltonian, 172, 183, 191, 287
as Legendre transformation, 62
charged particle, 293
ideal gas, 193
in normal coordinates, 223
oscillator, 172
relativistic, 43, 56
rigid rotator, 210
subsystem, 180
system, 180
Hamilton’s equations, 172
Heat, 69
and friction, 71
Heat, as inexact differential, 71
conduction of, 97
mechanical equivalent, 71
Heat capacity, 70, 71, 76
and energy fluctuation, 296
and equipartition theorem, 222
and third law, 234
at absolute zero, 123
crystal, 224, 239, 246
Debye theory, 239
Einstein theory, 239
ideal gas, 76, 99, 259
magnetic systems, 127, 216
metal, 243, 261, 267
molar, 70
monatomic gas, 152
oscillator, 237
relation between, 76, 98, 99, 128
rigid rotator, 222
subsystem, 219
subsystem with two states, 234
van der Waals gas, 113
Heat function, 78
Heat reservoir, 88, 103, 285
and ensemble, 286
entropy change of, 104
Helium II, 260
Helmholtz function, 101, 192
and equilibrium, 105
canonical ensemble, 288
distinguishable subsystems, 193
gas mixture, 199
identical particles, 253
thermal radiation, 245
Hole, 277, 279, 284
Hydrogen, 110, 117


Ice point, 73, 117
Ideal gas, 72, 179, 195
adiabatics, 81
electrons as, 262
energy, 77, 83, 84, 99, 196, 258
enthalpy, 78
entropy, 85, 196
equation of state, 73, 151, 196
fluctuations in, 297, 299
Gibbs function, 257
Hamiltonian, 193
heat capacities, 76, 99, 259
in kinetic theory, 148
Ideal gas, in porous plug experiment, 94
monatomic, 152
partition function, 195, 235, 256
potential for, 195
pressure, 148
relativistic, 200
temperature scale, 72, 91
thermal conductivity, 165
viscosity, 164
Ideal magnetic material, 126, 127, 212
Identical subsystems, 179
entropy, 190
Impurity semiconductor, 281
Independent subsystems, 179
and canonical ensemble, 290
Index of refraction, 24
Indistinguishability, 247
Induction, Faraday’s law, 10
magnetic, 8
Inertial mass, 38
Inertial system, 7, 12
Inexact differential, 64
and heat, 71
and work, 69
Insulator, 273, 277, 281
Intensive variables, 85, 100, 123, 126
Internal field, 213
Interval, 33, 38
Intrinsic semiconductor, 281
Invariance, 18, 31
Invariant, 30, 33
as tensor, 35
charge as, 46
divergence as, 36
Laplacian as, 36
of electromagnetic field, 55
rest mass as, 40
scalar product as, 34
Inversion curve, 115
Inversion temperature, 110, 116, 208
of hydrogen, 117
Irreversible process, 80, 82, 95
Isentropic process, 85
Isolated system, 178, 180, 285
and first law, 74
entropy, 86, 96
Isotherm, critical, 111
van der Waals gas, 110
Isothermal compressibility, 67, 137
Isothermal process, 81, 88 
and third law, 119 
Isotropic distribution, 144, 151, 153, 
157 


Johnson noise, 301 
Joule, 71 
Joule-Thomson coefficient, 115, 208 
Joule-Thomson effect, 114 
and ice point, 117 


Kelvin, 87, 99
Kennedy-Thorndike experiment, 16
Kilocalorie, 70
Kilogram calorie, 70
Kilogram mole, 70
Kilomole, 70
Kinematics and Lorentz transformation, 20
Kinetic energy, 41, 44
Kirchhoff, 134


Lagrange multipliers, 184, 251 
Lagrangian, and Legendre transformations, 62
in normal coordinates, 223 
relativistic, 56 
Lambda transition, 260 
Langevin, 209, 218 
Langevin function, 211, 215 
Laplacian, four-dimensional, 36 
Latent heat, 133, 137 
Law of atmospheres, 195 
Laws of thermodynamics, first, 74 
second, 86 
third, 119 
zeroth, 65 
Legendre transformations, 61 
and thermodynamic potentials, 101 
and third law, 123 
Length, measurement of, 22 
of 4-vector, 34 
of 4-velocity, 34 
Linde process, 117 
Liouville’s theorem, 174, 177 
Lorentz condition, 45 
Lorentz contraction, 22, 46 
Lorentz-Fitzgerald contraction, 15 
Lorentz transformation, 17, 18, 31, 32 
and covariance, 37
Lorentz transformation, and kinematics, 20
and rotations, 31 
and wave equation, 20 
differential, 27 
formulas, 20 
general, 20, 29 
successive, 29 
Loschmidt’s number, 160 


Macrostate, 181, 183 
maximum probability, 183, 187 
probability of, 182 
Magnetic dipole energy, 209 
Magnetic energy, 125 
Magnetic field, internal, 213 
Magnetic induction, 8 
of moving charge, 53 
transformation formulas, 50 
Magnetic material, ideal, 126, 212 
Magnetic moment, and canonical ensemble, 294
saturation, 211 
spontaneous, 214 
total, 125, 293 
Magnetic systems and third law, 246 
Magneto-caloric effect, 128 
Mass, effective, 276 
inertial, 38 
relativistic, 40 
rest, 38, 40, 44 
Mass point mechanics, 38 
Maxwell, 14 
Maxwell-Boltzmann distribution, 194 
Maxwell distribution, 153, 194, 271 
mean free path for, 160 
Maxwell reciprocal relations, 102, 133 
Maxwell’s equations, 12, 17, 44 
for vacuum, 44 
in moving systems, 10 
in tensor form, 49 
Mean free path, 159, 162, 164 
fixed molecules, 159 
Maxwell distribution, 160 
Mean square fluctuation, 296 
Mechanical equivalent of heat, 71 
Mechanics of mass point, 38 
Meson, 29 
Metal, conductivity, 272 
energy, 267, 273
Metal, entropy, 268
free electron theory, 261 ff. 
heat capacity, 243, 261, 267 
potential energy for electron, 269 
thermal conductivity, 271 
Michelson-Morley experiment, 14 
Microcanonical ensemble, 178, 181, 
285 
Microstate, 181 
Minkowski force, 39, 42, 44 
Mixture of gases, 198 
Mobility, 280 
Molar value, 70 
Mole, kilogram, 70 
Molecular distribution function, 194 
Molecular weight, 70, 152 
Mole fraction, 74 
Momentum, as 4-vector, 39 
conservation, 43, 147 
particle, 38 
transport, 162 
Monatomic gas, 152, 195, 235, 297 
energy, 222 
mean free path, 160 
transport coefficients, 166 
Motional electric field, 51 
Multipliers, Lagrange, 184, 251 
μ-space, 180
and quantum states, 231 
cell size, 198 
probability, 180, 220 


Nernst, 119, 121, 123 
Noise, 300 
Non-degenerate gas, 254 
Normal coordinates, 223 
Normal frequencies, density, 228 
electromagnetic, 226 
Normalization, 144, 154, 182, 205 
Normal modes, crystal, 223, 240 
distinguishable, 245 
electromagnetic, 226 
energy of, 223 
nth order phase transition, 137 
Number density, 146, 151, 194 
Number transport, 152 
Nyquist’s theorem, 301 


Observer, 7 
Orbital speed of earth, 13
Order-disorder transition, 218
Oscillator, 221
and Γ-space, 172
distinguishable, 229, 238 
electromagnetic field, 226 
energy, 220, 230, 237 
entropy, 229, 237 
Hamiltonian, 172 
heat capacity, 237 
partition function, 219, 237 


Paramagnetism, 127, 209, 218 
Partial derivatives, transformation formulas, 59
Partial pressure, 73, 200 
Particle, free, 221 
Partition function, 187, 191 
and equation of state, 192 
and magnetic moment, 209 
canonical ensemble, 287 
gas mixture, 199 
ideal gas, 195, 235, 256 
independent subsystems, 219 
oscillator, 219, 237 
quantum, 230 
Pauli exclusion principle, 247, 261, 263 
Periodic potential, 274 
Perpetual motion, 75, 87 
Phase, 130 
Phase space, 172 
Phase transition, equilibrium condition, 
131 
first order, 136 
lambda, 260 
second order, 217 
third order, 261 
Phase velocity, 11, 14 
Photon, 44, 257 
Planck, 87, 99, 119, 121 
Planck’s constant, 198 
Planck’s law, 244, 257, 303 
Plane wave, 11 
Point function, 63 
Polarization, in moving medium, 9 
in rotating cylinder, 16 
of electromagnetic field vectors, 228 
Porous plug experiment, 83, 94, 96, 
114, 138 
Postulates of relativity, 17, 31
Potential, for ideal gas, 195
four-dimensional, 47
periodic, 274
scalar, 45
thermodynamic, 100, 103
vector, 45
Potential energy, electron in metal, 269
Pressure, 69
and melting, 133 
and virial theorem, 203 
critical, 111 
effective, 107 
fluctuations, 298 
gas mixture, 199 
ideal gas, 148, 299 
kinetic definition, 151 
partial, 73, 200 
thermal radiation, 245 
vapor, 131, 197 
zero point, 264
Probability, 141, 142
and entropy, 189
and time, 178
and volume, 171
Boltzmann, 182
Bose-Einstein, 249
canonical ensemble, 286, 295
Fermi-Dirac, 250
in Γ-space, 176
in μ-space, 180, 220
macrostate, 182
maximum, 183, 186, 188, 251
normalized, 182
of a hole, 279
of electron energy state, 263
relative, 182
thermodynamic, 182
Probability density, 143, 144, 148
Proper time, 27, 33


Quantum energy, free particle, 230 
oscillator, 230 
rigid rotator, 246 
Quantum partition function, 230 
ideal gas, 235 
oscillator, 237 
Quantum states, 230 
and μ-space cells, 231
Quasi-static process, 79


Radiation, thermal, 225
Rayleigh-Jeans law, 229, 244, 303 
Real gases, 106 ff., 130 
Reciprocal relations, 102, 122 
Reduced variables, 112 
Refraction, index of, 24 
Relative probability, 182 
Relativity, classical, 8 
Galilean, 8, 12, 15 
general, 17 
of simultaneity, 21 
postulates of, 17 
special, 7, 17 
Rest energy, 41 
Rest mass, 38, 40, 44 
Rest system, 27, 29, 46 
Reversible process, 79, 88 
adiabatic, 80 
expansion of gas, 86 
Richardson equation, 270 
Riemann zeta function, 242 
Rigid rotator, 210, 246 
energy, 222 
heat capacities, 222 
Rms speed, 152, 158, 300 
Rotation of axes, 29 
and Lorentz transformations, 31, 33 


Sackur-Tetrode formula, 197
Saturation moment, 211
Scalar as tensor, 35
Scalar potential, 45
Scalar product, as invariant, 34
of 4-vectors, 33, 43
Second law of thermodynamics, 86
and perpetual motion, 87
and thermal radiation, 225
Second order phase transition, 137, 217
Second rank tensor, see Tensor
Semiconductors, 281
Sign convention for work, 69
Silicon, 281
Simultaneity, 21
Sommerfeld, 262
Spacelike interval, 33, 38
Special relativity, 17
Specific heat, 70
Speed, average, 158
distribution of, 155
most probable, 157
Speed, of the earth, 13
rms, 155, 158, 300 
Spherical wave, 25 
Spin, 262, 270, 295 
Spontaneous moment, 214 
Standard conditions, 73 
State, and Γ-space, 172, 180
and μ-space, 180
equation of, 67 
in quantum theory, 230 
maximum probability, 186, 251 
probability of, 180 
steady, 146 
State function, 63 
States, corresponding, 112 
density of, 255, 276 
number in band, 276 
State variable, 65 
Steady state, 146 
Stefan-Boltzmann law, 245 
Stirling approximation, 184, 247, 286 
Stress, 161 
Sublimation, 133 
Subsystems, and canonical ensemble, 
290 
and partition function, 219 
distinguishable, 193, 290 
Hamiltonian, 180 
heat capacity, 219 
independent, 179 
probability, 220 
with two states, 232 
Susceptibility, electric, 9 
Symmetric tensor, 35 
System, thermodynamic, 66 


Temperature, 65 
absolute, 86, 90, 91, 94 
absolute zero, 91 
and conductivity, 281 
and viscosity, 165 
critical, 111 
Curie, 213, 215, 217 
Debye, 241 
empirical, 91 
ice point, 73, 117 
ideal gas, 91 
inversion, 110, 116, 208 
kinetic definition, 151 
thermal radiation, 225
Temperature scales, 66, 72
Tensor, electromagnetic field, 48 
Tensors, and covariance, 37 
definitions and properties, 34 ff. 
Thermal conductivity, ideal gas, 165 
metal, 271 
Thermal expansion coefficient, 67, 122, 
137 
Thermal noise, 300 
Thermal radiation, 225, 303 
as photons, 257 
energy density, 226, 228, 244 
thermodynamic functions, 245, 246 
Thermionic emission, 269 ff. 
Thermodynamic potentials, 100 ff. 
Thermodynamic probability, 182 
Thermodynamic system, 66 
Third law of thermodynamics, 119, 198 
and Dulong-Petit law, 224 
and ferromagnetism, 246 
and heat capacities, 234 
and Weiss theory, 218 
consequences, 122 
origin, 289 
Third order phase transition, 260 
Thomsen, 120 
Throttling process, 114 
Time, dilation, 21 
proper, 27, 33 
Time average, 173, 177 
Timelike interval, 33, 38 
Total differential, 59 
Total energy in relativity, 41 
Transformation, Galilean, 7, 11, 17, 20 
general Lorentz, 29 
Legendre, 61 
Lorentz, 17, 18, 20 
of acceleration, 29 
of charge density, 46 
of current, 47 
of electric field, 9 
of electromagnetic fields, 50 
of force, 44 
of 4-vector, 33 
of partial derivatives, 59 
of wave equation, 20 
Transport, of a general property, 165 
of energy, 158, 166 
of molecules, 145, 152, 167 
of momentum, 162
Trouton-Noble experiment, 12
Turbines, 78
Two-dimensional gas, 152, 167


Uncertainty principle, 198, 229
Uncorrected Boltzmann counting, 193, 197


Valence band, 277 
van der Waals gas, 109, 130 
critical variables, 111 
energy, 112 
entropy, 114 
equation of state, 109 
free expansion of, 118 
heat capacities, 113 
inversion temperature, 110 
isotherms, 110 
Joule-Thomson coefficient, 116 
thermal expansion coefficient, 109 
virial coefficient, 201, 207 
van Leeuwen theorem, 292 
Vapor, 131, 138 
Gibbs function, 135 
Vapor pressure, 131 
and absolute entropy, 136, 197 
of ammonia, 138 
Vapor pressure curve, 132 ff. 
Variables, conjugate, 100, 102, 198 
extensive, 100, 126 
intensive, 100, 126 
reduced, 112 
Vector potential, 45
Velocity, addition formulas, 23, 24, 28
as 4-vector, 34
isotropic distribution, 144, 151
Maxwell distribution, 153, 194
of thermionic electrons, 271
phase, 11
Velocity components, distribution, 155
Velocity space, 145
Virial coefficients, 200
general formula, 207
ideal gas, 258
van der Waals gas, 201, 207
Virial theorem, 202
Virtual variations, 186
Viscosity, 161
and density, 164
and temperature, 165
ideal gas, 164
origin, 164
two-dimensional gas, 167
Volume, critical, 111
effective, 107


Water, 133 
Wave, electromagnetic, 10 
phase velocity, 11 
plane, 11 
spherical, 25 
Wave equation, 10, 11, 20 
Weight, molecular, 70, 152 
Weighted string, 274 
Weiss theory, 212 
and third law, 218 
as second order transition, 217 
Wiedemann-Franz law, 273 
Wien’s displacement law, 244 
Wien’s law, 244 
Work, and reversible expansion, 86 
as inexact differential, 69 
for fluid, 69 
in statistical mechanics, 191 
in thermodynamics, 68 
maximum available, 105 
sign convention, 69 
Work function, 269 


Zero point energy, 77, 233 
Zeroth law of thermodynamics, 65 
Zeta function, 242 


